Dryad

Data from: Application of a metabolic network-based graph neural network for the identification of toxicant-induced perturbations

Data files

Jun 09, 2025 version files 65.27 MB

Abstract

This study applies a graph neural network (GNN)-based approach to investigate metabolic perturbations in mouse liver transcriptomic data following toxicant exposure. A mouse-specific metabolic reaction network was constructed from Reactome, replacing the human network used in prior models. Publicly available transcriptomic datasets (n = 7,903 control samples across 26 tissues) were curated from Recount3 for model training and validation. Test datasets (n = 299) included liver samples from mice exposed to the environmental toxicant 2,3,7,8-tetrachlorodibenzo-p-dioxin (TCDD).

Gene counts were filtered to retain only genes linked to known metabolic reactions and were then transformed using DESeq2. For each reaction, principal component analysis (PCA) was applied to the expression of its associated genes, and the first principal component (PC1) was used as that reaction's node feature. A GNN built with PyTorch Geometric, using GraphConv layers and global mean pooling, was trained to classify tissue type and later adapted via transfer learning to classify toxicant response.
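
The per-reaction PCA step can be illustrated with a small, dependency-free sketch. This is not the authors' code: the reaction and gene names, the toy expression values, and the helper `pc1_scores` are all hypothetical, and power iteration is used here only to keep the example self-contained (a PCA routine from a standard library would normally be used instead).

```python
import math

def pc1_scores(matrix, n_iter=200):
    """Return one PC1 score per sample.

    matrix: list of samples, each a list of transformed expression values
    for the genes mapped to a single reaction (hypothetical layout).
    """
    n_samples = len(matrix)
    n_genes = len(matrix[0])
    # Center each gene (column) on its mean across samples.
    means = [sum(row[j] for row in matrix) / n_samples for j in range(n_genes)]
    centered = [[row[j] - means[j] for j in range(n_genes)] for row in matrix]
    # Sample covariance matrix between the genes of this reaction.
    cov = [[sum(r[a] * r[b] for r in centered) / (n_samples - 1)
            for b in range(n_genes)] for a in range(n_genes)]
    # Power iteration: repeated multiplication by cov converges to the
    # leading eigenvector, provided the start vector is not orthogonal to it.
    vec = [float(j + 1) for j in range(n_genes)]
    for _ in range(n_iter):
        nxt = [sum(cov[a][b] * vec[b] for b in range(n_genes))
               for a in range(n_genes)]
        norm = math.sqrt(sum(v * v for v in nxt))
        if norm == 0.0:  # no variance along the current direction
            break
        vec = [v / norm for v in nxt]
    # Project the centered samples onto the leading eigenvector.
    return [sum(r[j] * vec[j] for j in range(n_genes)) for r in centered]

# Hypothetical toy data: two reactions, three genes, four samples.
reaction_genes = {"R1": ["geneA", "geneB"], "R2": ["geneC"]}
expression = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],
    "geneC": [5.0, 5.5, 4.5, 5.0],
}
n_samples = 4

# One scalar node feature per reaction and sample.
node_features = {}
for reaction, genes in reaction_genes.items():
    matrix = [[expression[g][s] for g in genes] for s in range(n_samples)]
    node_features[reaction] = pc1_scores(matrix)
```

Note that the sign of a principal component is arbitrary, so PC1 scores from different PCA implementations may differ by a factor of -1. For a reaction with a single gene, PC1 reduces to the centered expression of that gene.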

Integrated Gradients was used to estimate the importance of individual edges in the reaction network, and network centrality measures were used to identify key reactions. Differential gene expression and enrichment analyses were performed for comparison, to contextualize the GNN findings. All data were obtained from public sources and all code is available on GitHub.
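
As a rough illustration of how Integrated Gradients can assign importance to edges, the sketch below attributes the output of a toy model to a per-edge mask. Everything here is an assumption made for illustration: the toy `model_output` function stands in for a trained GNN, the edge-mask formulation (baseline of all zeros, input of all ones) is one common way to obtain edge attributions and is not confirmed by the record, and gradients are approximated numerically rather than by automatic differentiation.

```python
import math

def model_output(edge_mask, edges, node_feat):
    """Toy stand-in for a trained GNN (hypothetical): one round of masked
    message passing, a tanh nonlinearity, then mean pooling over nodes."""
    agg = list(node_feat)
    for m, (src, dst) in zip(edge_mask, edges):
        agg[dst] += m * node_feat[src]  # message scaled by the edge mask
    return sum(math.tanh(v) for v in agg) / len(agg)

def integrated_gradients(f, x, baseline, steps=200, eps=1e-5):
    """Approximate IG_i = (x_i - b_i) * integral over alpha in [0, 1] of
    dF/dx_i evaluated at b + alpha * (x - b), using a midpoint Riemann sum
    and central finite differences for the gradient."""
    attributions = [0.0] * len(x)
    for k in range(steps):
        alpha = (k + 0.5) / steps
        point = [b + alpha * (xi - b) for xi, b in zip(x, baseline)]
        for i in range(len(x)):
            up = point[:]
            up[i] += eps
            down = point[:]
            down[i] -= eps
            grad_i = (f(up) - f(down)) / (2 * eps)
            attributions[i] += grad_i * (x[i] - baseline[i]) / steps
    return attributions

# Hypothetical three-node reaction graph with one scalar feature per node.
node_feat = [0.5, -1.0, 2.0]
edges = [(0, 1), (1, 2), (2, 0)]

def f(mask):
    return model_output(mask, edges, node_feat)

full_mask = [1.0] * len(edges)  # all edges present
no_edges = [0.0] * len(edges)   # baseline: all edges switched off
attributions = integrated_gradients(f, full_mask, no_edges)

# Rank edges by absolute attribution, most important first.
ranked_edges = sorted(zip(edges, attributions), key=lambda p: -abs(p[1]))
```

A useful sanity check is the completeness property of Integrated Gradients: the attributions should sum, up to numerical error, to the difference between the model output at the input and at the baseline.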