Data and code for: Understanding the density maximum of water with machine learned potentials
Overview
This repository contains the machine-learned potential, processed data, example scripts, and a representative snapshot used in the paper:
Understanding the Density Maximum of Water with Machine Learned Potentials
Song et al., Science Advances, 2026
All files described below are packaged in a single archive:
- Data.zip: complete dataset containing all files and folders listed in this README.
- Note: Full molecular dynamics (MD) trajectories are not included due to their large size. The files provided are sufficient to reproduce the main figures using the included model and scripts.
Directory Structure
After extracting Data.zip, the top-level directory is organized as follows:
Data/
├── models/ # Trained machine-learned interatomic potential models
│ ├── graph_PBE0-vdw.pb
│ ├── graph_PBE-vdw.pb
│ └── graph_PBE.pb
├── scripts/ # Example analysis scripts
│ ├── 01-write_INPUT.sh
│ └── 02-run.sh
├── voronoi_data/ # Voronoi-based structural analysis results
│ ├── PBE0-vdW-290K.tar.gz
│ ├── PBE0-vdW-310K.tar.gz
│ └── PBE0-vdW-330K.tar.gz
├── 1024-water-330K-snapshot.dump # Representative snapshot (LAMMPS format)
└── README.md
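As a quick sanity check after download, the layout above can be verified without unpacking the archive. The following is a minimal Python sketch and is not part of the repository: the function name `missing_entries` and the `EXPECTED` list are illustrative, and the sketch assumes the archive stores its members under a top-level `Data/` folder exactly as shown in the tree.

```python
import zipfile

# Paths copied from the directory tree in this README.
# Assumption: members are stored with the "Data/" prefix.
EXPECTED = [
    "Data/models/graph_PBE0-vdw.pb",
    "Data/models/graph_PBE-vdw.pb",
    "Data/models/graph_PBE.pb",
    "Data/scripts/01-write_INPUT.sh",
    "Data/scripts/02-run.sh",
    "Data/voronoi_data/PBE0-vdW-290K.tar.gz",
    "Data/voronoi_data/PBE0-vdW-310K.tar.gz",
    "Data/voronoi_data/PBE0-vdW-330K.tar.gz",
    "Data/1024-water-330K-snapshot.dump",
    "Data/README.md",
]


def missing_entries(zip_path, expected=EXPECTED):
    """Return the expected member paths that are absent from the archive."""
    with zipfile.ZipFile(zip_path) as zf:
        names = set(zf.namelist())
    return [path for path in expected if path not in names]
```

Calling `missing_entries("Data.zip")` returns an empty list when every listed file is present. If the archive turns out to use a different prefix, adjust `EXPECTED` accordingly.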
File and Folder Descriptions
1. models/
Trained machine-learned interatomic potential models used in the study. Models were trained with DeePMD-kit on ab initio reference datasets at different levels of density functional theory (DFT).
| File | Description |
|---|---|
| graph_PBE0-vdw.pb | Machine-learned potential trained on PBE0 + van der Waals (vdW) reference data |
| graph_PBE-vdw.pb | Machine-learned potential trained on PBE + vdW reference data |
| graph_PBE.pb | Machine-learned potential trained on PBE reference data (no vdW correction) |
2. 1024-water-330K-snapshot.dump
A representative atomic configuration of liquid water. The system size is 1024 water molecules, and the temperature is 330 K.
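The snapshot header can be used to recover the simulation box and, from it, the mass density of the configuration. The sketch below is illustrative only and rests on several assumptions not stated in this README: the dump uses the standard LAMMPS text layout (`ITEM: NUMBER OF ATOMS`, `ITEM: BOX BOUNDS`), the box is orthogonal, and lengths are in angstroms. The function names are hypothetical.

```python
AVOGADRO = 6.02214076e23   # 1/mol (exact SI value)
WATER_MOLAR_MASS = 18.015  # g/mol for H2O


def read_dump_header(lines):
    """Return (natoms, [(lo, hi), ...]) from the first frame of a LAMMPS dump."""
    natoms, bounds = None, []
    it = iter(lines)
    for line in it:
        line = line.strip()
        if line.startswith("ITEM: NUMBER OF ATOMS"):
            natoms = int(next(it).split()[0])
        elif line.startswith("ITEM: BOX BOUNDS"):
            # Triclinic headers carry tilt labels (xy xz yz); not handled here.
            if "xy" in line.split():
                raise ValueError("triclinic boxes are not handled in this sketch")
            for _ in range(3):
                lo, hi = map(float, next(it).split()[:2])
                bounds.append((lo, hi))
        elif line.startswith("ITEM: ATOMS"):
            break  # header of the first frame is complete
    if natoms is None or len(bounds) != 3:
        raise ValueError("incomplete dump header")
    return natoms, bounds


def water_density_g_cm3(natoms, bounds):
    """Mass density, assuming 3 atoms per molecule and lengths in angstroms."""
    volume_a3 = 1.0
    for lo, hi in bounds:
        volume_a3 *= hi - lo
    n_molecules = natoms // 3
    mass_g = n_molecules * WATER_MOLAR_MASS / AVOGADRO
    return mass_g / (volume_a3 * 1e-24)  # 1 A^3 = 1e-24 cm^3
```

With the file from this repository, a call would look like `with open("1024-water-330K-snapshot.dump") as fh: natoms, bounds = read_dump_header(fh)`, followed by `water_density_g_cm3(natoms, bounds)`. A single snapshot gives an instantaneous value, not the ensemble average reported in the paper.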
3. scripts/
Example scripts for reproducing the analysis workflow.
| File | Description |
|---|---|
| 01-write_INPUT.sh | Generates D310 input files required for analysis |
| 02-run.sh | Executes the analysis pipeline |
Dependencies:
- D310 (Candela): https://github.com/MCresearch/Candela
- Voro++ library: https://math.lbl.gov/voro++/doc/custom.html
These external tools must be installed and properly configured in your environment before running the scripts.
4. voronoi_data/
This folder contains structural analysis results derived from MD simulations using the PBE0-vdW model at selected temperatures:
| File | Description |
|---|---|
| PBE0-vdW-290K.tar.gz | Structural analysis results for PBE0-vdW at 290 K |
| PBE0-vdW-310K.tar.gz | Structural analysis results for PBE0-vdW at 310 K |
| PBE0-vdW-330K.tar.gz | Structural analysis results for PBE0-vdW at 330 K |
Each compressed archive contains processed data generated using D310 (Candela). The internal directory structure is identical for all temperatures. Taking PBE0-vdW-290K.tar.gz as an example, the extracted directory structure is:
290/
├── 0_pdf/ # Pair distribution function (PDF) calculations
│ ├── 1_pdf_OO/ # O-O pair distribution function
│ │ ├── d310.out # D310 output log
│ │ ├── INPUT # Input file
│ │ ├── result.dat # Calculation results
│ │ └── running0.log # Running log
│ ├── 2_pdf_OH/ # O-H pair distribution function
│ └── 3_pdf_HH/ # H-H pair distribution function
├── 1_hbs/ # Hydrogen bonds (HB) analysis
├── 2_voronoi/ # Voronoi cell analysis
├── 2b_voronoi/ # Additional Voronoi analysis
├── 2c_voronoi/ # Additional Voronoi analysis
├── 3_voronoi_postprocess/ # Voronoi data post-processing
├── 4_voronoi_postprocess2/ # Additional Voronoi data post-processing
├── 5_density/ # Density analysis
├── 6_hb_length/ # HB length analysis
├── 7_voronoi_pdf/ # Pair distribution function (PDF) of Voronoi neighbors
├── 8_bdf_hb/ # Bond angle distribution function of HBs
└── 9_dist3/ # 3D distribution analysis of partially H-bonded water molecules around a central water molecule
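To see which analyses an archive holds before unpacking it, the member names can be inspected directly. This is a minimal Python sketch, not a script shipped with the repository; the function name `list_calculation_dirs` is hypothetical, and the sketch assumes each archive has a single top-level temperature folder (such as `290/`) as in the tree above.

```python
import tarfile


def list_calculation_dirs(archive_path):
    """Return the sorted second-level folder names inside a results archive."""
    dirs = set()
    with tarfile.open(archive_path, "r:gz") as tar:
        for name in tar.getnames():
            # Drop empty and "." components so "./290/..." is handled too.
            parts = [p for p in name.split("/") if p not in ("", ".")]
            # parts[0] is the temperature folder; parts[1] counts as a
            # calculation folder only if at least one member lies beneath it,
            # so empty folders are not reported.
            if len(parts) >= 3:
                dirs.add(parts[1])
    return sorted(dirs)
```

For example, `list_calculation_dirs("PBE0-vdW-290K.tar.gz")` should list names such as `0_pdf` and `1_hbs` if the archive matches the tree shown here.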
The folder names correspond to the D310 (Candela) calculation functions used. For example, the 0_pdf/ directory contains calculations using the pdf function, as specified in the first line of the corresponding INPUT file:
calculation pdf
A typical INPUT file is shown below:
calculation pdf # Analysis function used in D310
system water # System type
geo_in_type LAMMPS # Input trajectory format
geo_directory /path/to/290.dump # Path to trajectory file
geo_1 1 # Starting frame index
geo_2 400000 # Ending frame index
geo_interval 10 # Sampling interval
geo_ignore 40000 # Number of initial frames ignored for equilibration
ntype 2 # Number of atomic species
natom 3072 # Total number of atoms
natom1 1024 # Number of atoms of type 1
natom2 2048 # Number of atoms of type 2
dr 0.001 # Radial grid spacing
id1 O # Atomic species labels
id2 H
ele1 O # Atomic pair analyzed for PDF
ele2 O
rcut 12.0 # Radial cutoff distance
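When adapting an INPUT file to a different trajectory, it can help to read the parameters into a dictionary and check that the atom counts are consistent. The following Python sketch is an illustration rather than part of D310 (Candela): it assumes one `key value` pair per line and treats everything after `#` as a comment, which matches the annotated listing above but is not confirmed against the Candela parser. The function names are hypothetical.

```python
def parse_input(text):
    """Parse 'key value  # comment' lines into a dict of strings."""
    params = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comment
        if not line:
            continue
        parts = line.split(None, 1)  # split on any whitespace, once
        params[parts[0]] = parts[1].strip() if len(parts) > 1 else ""
    return params


def check_atom_counts(params):
    """True if natom equals natom1 + ... + natomN, with N = ntype."""
    ntype = int(params["ntype"])
    total = sum(int(params[f"natom{i}"]) for i in range(1, ntype + 1))
    return total == int(params["natom"])
```

For the listing above, `natom1 + natom2 = 1024 + 2048 = 3072`, which matches `natom`, so `check_atom_counts` returns `True`.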
In each calculation directory, the standard file structure includes:
| File | Description |
|---|---|
| INPUT | Input |
| *.dat | Results |
| d310.out | D310 output log |
| running0.log | Running log |
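The standard layout makes it straightforward to take stock of an extracted temperature folder. The sketch below is a hedged example, not a repository script: it walks a directory, treats every folder that contains an `INPUT` file as a calculation directory, and records which of the files in the table are present. The function name `summarize_calculations` and the report keys are illustrative.

```python
from pathlib import Path


def summarize_calculations(root):
    """Map each folder containing an INPUT file to a report of its files."""
    report = {}
    for input_file in sorted(Path(root).rglob("INPUT")):
        calc_dir = input_file.parent
        report[str(calc_dir.relative_to(root))] = {
            "dat_files": sorted(p.name for p in calc_dir.glob("*.dat")),
            "has_d310_out": (calc_dir / "d310.out").is_file(),
            "has_running_log": (calc_dir / "running0.log").is_file(),
        }
    return report
```

Running `summarize_calculations("290")` on an extracted archive gives one entry per calculation directory, which makes missing result files easy to spot.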
Additional details regarding D310 functions and input parameters can be found in:
- https://github.com/MCresearch/Candela
- https://github.com/MCresearch/Candela/blob/main/docs/Candela_manual.pdf
5. README.md
This file provides an overview of the repository contents.
Contact
For questions or additional data requests, please contact Yizhi Song at yizhi.song@unt.edu.
