Data from: Tissue-specific mutagenesis from endogenous guanine damage is suppressed by PolK and DNA repair
Data files
Nov 19, 2025 version files 7.31 GB
Abstract
We report here LC-MS data using targeted adductomics and biochemical analyses to identify endogenous N2-dG lesions requiring Polk-mediated bypass, and untargeted adductomics to reveal new guanine lesions that engage NER. These findings uncover the nature of endogenous DNA damage and the coordinated roles of repair and tolerance pathways that limit mutagenesis in tissues.
Dataset DOI: 10.5061/dryad.r4xgxd2sp
Description of the data and file structure
Targeted adductomics data were collected to identify endogenous N2-dG lesions (guanine DNA adducts) that require Polk-mediated bypass. Untargeted adductomics was used to reveal new guanine lesions that engage NER. These findings uncover the nature of endogenous DNA damage and the coordinated roles of repair and tolerance pathways that limit mutagenesis in tissues.
The DNA adductomic screening for both known endogenous and unknown putative DNA adducts was performed using the data in the Untargeted_LC-MS2MS3_Screening.zip folder. The file names are self-explanatory. "WT" in a file name indicates wild-type; these files served as negative controls. "Range" indicates the mass range analyzed: the analysis was split into separate mass ranges to increase the sensitivity of the assay.
The quantitation of the endogenous DNA adducts presented in Figure 5b was performed using the data in the Quantitative_Parallel_Reaction_Monitoring_(PRM)_of_Endogenous_Adducts.zip folder. The file names are self-explanatory.
The measurement of putative DNA adducts presented in Figure 5h was performed using the data in the Parallel_Reaction_Monitoring_of_Putative_Adducts.zip folder. The file names are self-explanatory.
The data used for the generation of Supplementary Figures 11-17 are contained in the Data_used_for_supplementary_figures.zip folder. Data for Supplementary Figures 11 and 17 are in the DM01_65_Liver_XPC_combined_Range_4_606_607.raw file. Data for Supplementary Figures 12 and 16 are in the DM01_65_Liver_XPA_4_tMS3_updated_603_589.raw file. Data for Supplementary Figures 13 and 14 are in the DM01_65_Liver_XPA_1_tMS3_top10_FS_method3_595_602_580.raw file.
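Because the file names carry the sample information, they can be grouped programmatically. The following is a minimal Python sketch, not part of the dataset. It assumes underscore-separated tokens, that genotype appears as one of the tokens WT, XPA, or XPC, and that "Range_N" marks the mass range. These conventions and the helper name `describe_raw_file` are assumptions inferred from the file names quoted above, not a documented naming scheme.

```python
import re
from pathlib import PurePath

# Genotype and tissue tokens ASSUMED from the file names quoted in this README.
GENOTYPES = {"WT", "XPA", "XPC"}
TISSUES = {"liver", "kidney"}

def describe_raw_file(name: str) -> dict:
    """Pull tissue, genotype and mass-range index out of a .raw file name."""
    stem = PurePath(name).stem
    tokens = stem.split("_")
    genotype = next((t for t in tokens if t.upper() in GENOTYPES), None)
    tissue = next((t for t in tokens if t.lower() in TISSUES), None)
    match = re.search(r"Range_(\d+)", stem, flags=re.IGNORECASE)
    return {
        "file": name,
        "tissue": tissue,
        "genotype": genotype,
        # None when the file name carries no "Range_N" token.
        "mass_range": int(match.group(1)) if match else None,
    }

info = describe_raw_file("DM01_65_Liver_XPC_combined_Range_4_606_607.raw")
print(info)
```

Files without a "Range" token, such as the tMS3 files, simply get `mass_range` set to `None`.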
Code/software
Recommended Software for Data Analysis
Data were acquired on a Thermo Scientific Orbitrap Lumos mass spectrometer, so the files are in the vendor's .raw data format.
Primary Recommendation
- Thermo Scientific Xcalibur FreeStyle: Designed for the analysis of Thermo Fisher instrument data files and used here for the quantitation of both known and putative DNA adducts.
- Thermo Scientific Compound Discoverer 3.4: Designed to analyze Thermo Fisher data files (.raw) for metabolomics and other screening assays and used here for the DNA adductomic data analysis.
Additional Software Options
- OpenChrom: A robust open-source alternative that supports a wide range of chromatography and mass spectrometry data formats. It can be used here as an alternative to FreeStyle for the quantitation of DNA adducts.
- MZmine 4: An open-source, vendor-neutral platform for mass spectrometry data processing. It can import Thermo .raw files as well as open formats such as mzML, offering flexibility for data analysis across different instrument platforms.
Note on Proprietary File Types and Equipment Used
- The dataset includes proprietary file type .raw which is specific to the Thermo Scientific instruments used in this study.
- While these proprietary files provide full transparency regarding data generation, opening them may require vendor software or a tool that supports the Thermo .raw format. For complete information on the analytical methods, instrument settings, and experimental procedures, please refer to the 'Materials and Methods' section of the dataset documentation. This section provides additional context and guidance for understanding and using these specialized file types effectively.
Purpose and Utility
- This dataset is a comprehensive resource for researchers aiming to conduct detailed analyses of endogenous and putative DNA adducts in the liver and kidney DNA samples.
- The structured and clearly labeled folders make navigating the dataset efficient, allowing for targeted analyses of specific samples or comparative studies across the collection.
Quantitative Parallel Reaction Monitoring (PRM) of Endogenous Adducts
Samples from wild type, Xpa -/-, and Xpc -/- liver and kidney DNA were reconstituted in 20 μL of H2O for LC-MS2 analysis targeting known endogenous DNA adducts. The analysis was done using an Orbitrap Exploris 480 instrument (ThermoFisher) coupled to a Vanquish™ Neo UHPLC system (Thermo Fisher) using positive nanoelectrospray ionization (NSI) with a source temperature of 300°C and a spray voltage of 1900V. The reversed-phase chromatographic separation was performed using a nanoflow column (50 cm x 75 μm, CoAnn Technologies, Richland, WA) self-packed with Luna C18 (5 μm, 100 Å, Phenomenex) stationary phase. The mobile phases consisted of 5 mM NH4OAc (A) and 95 % CH3CN in H2O (B), and the injection volume was 4 μL. The gradient started with an increase from 1 % to 5 % B over 5 min at a flow rate of 0.3 μL/min, followed by an increase to 22 % B over 30 min. The gradient was then increased to 95 % B over 1 min, and the flow rate was increased to 0.9 μL/min. Finally, the gradient was maintained at 95 % B for 2 min and the flow rate was increased to 1.0 µL/min to wash the system for a total run time of 43 min. The column was re-equilibrated at the starting conditions with 5 column volumes to prepare for the next injection. This targeted method included the following precursor ions and corresponding extracted product ions used for quantitation:
- 338.1459 m/z → 222.0986 m/z for N2-propano-dG
- 343.1311 m/z → 227.0837 m/z for [^15^N5]N2-propano-dG
- 324.1302 m/z → 208.0829 m/z for N2-acrolein-dG
- 339.1490 m/z → 218.0848 m/z for [^13^C10^15^N5]N2-acrolein-dG
- 284.0989 m/z → 168.0516 m/z for 8-oxo-dG
- 287.0964 m/z → 171.0490 m/z for [^13^C^15^N2]-8-oxo-dG
- 292.1040 m/z → 176.0567 m/z for N2-etheno-dG
- 297.1208 m/z → 176.0567 m/z for [^13^C5]N2-etheno-dG
MS2 fragmentation was performed with a quadrupole isolation width of 1.5 m/z, HCD collision energy of 30%, AGC value of 1000%, maximum injection time of 200 ms, and a resolution setting of 60,000.
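As a quick consistency check on the transitions listed above, the mass difference between each precursor and its product ion can be computed directly. The sketch below is illustrative Python, not part of the acquisition or processing software. The reference value of 116.0474 Da is the 2′-deoxyribose neutral loss quoted in the screening method; reading a larger loss as a label carried on the sugar is an interpretation, not a statement from the source.

```python
# Precursor -> product transitions (m/z) copied from the method description.
TRANSITIONS = {
    "N2-propano-dG": (338.1459, 222.0986),
    "[15N5]N2-propano-dG": (343.1311, 227.0837),
    "N2-acrolein-dG": (324.1302, 208.0829),
    "[13C10,15N5]N2-acrolein-dG": (339.1490, 218.0848),
    "8-oxo-dG": (284.0989, 168.0516),
    "[13C,15N2]-8-oxo-dG": (287.0964, 171.0490),
    "N2-etheno-dG": (292.1040, 176.0567),
    "[13C5]N2-etheno-dG": (297.1208, 176.0567),
}

# Neutral loss of 2'-deoxyribose (Da), as quoted in the screening method.
DEOXYRIBOSE_LOSS = 116.0474

def neutral_loss(precursor: float, product: float) -> float:
    """Mass difference between precursor and product ion, rounded to 4 decimals."""
    return round(precursor - product, 4)

losses = {name: neutral_loss(*mz) for name, mz in TRANSITIONS.items()}
for name, loss in losses.items():
    delta = loss - DEOXYRIBOSE_LOSS
    print(f"{name:28s} loss = {loss:.4f} Da (delta {delta:+.4f})")
```

Most transitions differ from the deoxyribose value by about 0.0001 Da. The two internal standards that lose roughly 5.017 Da more would be consistent with five 13C atoms located in the sugar, under the interpretation stated above.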
A 100-650 m/z full scan event with a resolution setting of 15,000 was included to monitor for any anomalies in sample composition or irregularities in the analysis. Calibration curves were prepared using standard solutions of the N2-acrolein-dG and N2-propano-dG ranging from 2.5 to 250 amol/μL with 100 amol/μL of the internal standards [^13^C10^15^N5]N2-acrolein-dG and [^15^N5]N2-propano-dG. In a separate calibration curve, a constant amount of the internal standard [^13^C^15^N2]-8-oxo-dG (1 fmol/μL) was mixed with different amounts of 8-oxo-dG (10–400 fmol/μL). Utilizing these calibration curves, we were able to absolutely quantify each of our adducts except for N2-etheno-dG which was not included in the calibration curve standard mix. Semi-quantitation of N2-etheno-dG was performed by assuming linear and equal response for N2-etheno-dG and [^13^C5]N2-etheno-dG in the sample data. Quantified adduct levels were all normalized to the measured dG amounts.
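The two quantitation routes described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' processing workflow: the calibration points, peak areas, and the internal standard amount passed to `semi_quantify` are hypothetical, and the function names are invented for this example.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# HYPOTHETICAL calibration points: analyte concentration (amol/uL) versus
# analyte / internal-standard peak-area ratio. Not values from the dataset.
conc = [2.5, 10.0, 50.0, 100.0, 250.0]
ratio = [0.025, 0.10, 0.50, 1.00, 2.50]
slope, intercept = fit_line(conc, ratio)

def quantify(area_analyte, area_is):
    """Back-calculate concentration from a sample's area ratio via the curve."""
    return (area_analyte / area_is - intercept) / slope

def semi_quantify(area_analyte, area_is, is_amount):
    """Equal-response assumption: amount = area ratio x internal standard amount."""
    return area_analyte / area_is * is_amount

# HYPOTHETICAL peak areas and internal standard amount.
print(quantify(4.0e5, 8.0e5))
print(semi_quantify(3.0e4, 1.0e5, 100.0))
```

The second function mirrors the semi-quantitation used for N2-etheno-dG, where no calibration curve was available and equal response of analyte and labeled standard is assumed. Normalization to the measured dG amount would follow as a simple division.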
Parallel Reaction Monitoring of Putative Adducts
Samples from wild-type, Xpa -/- and Xpc -/- liver DNA were prepared for LC-MS2 analysis of the putative DNA adducts detected in the screening assay. The analysis was done using an Orbitrap Lumos instrument (ThermoFisher) coupled to a UHPLC system (UltiMate 3000 RSLCnano UHPLC, ThermoFisher) using positive NSI with a source temperature of 300°C and a static spray voltage of 2200V. The UHPLC was equipped with a 5 µL loop and reversed phase chromatographic separation was performed using a nanoflow column (50 cm x 75 μm, CoAnn Technologies, Richland, WA) self-packed with Luna C18 (5 μm, 100 Å, Phenomenex). The mobile phases consisted of 5 mM NH4OAc (A) and 95 % CH3CN in H2O (B) and the injection volume was 4 μL. The gradient started at 1 % B for 20 min at a flow rate of 0.3 μL/min, followed by an increase to 5 % over 5 min, then an increase to 22 % over 35 min, followed by an increase to 95 % over 1 min and held at these conditions for 2 min. The gradient was then returned to 1 % B in 1 min and the column was re-equilibrated at this mobile phase composition for 3 min at a flow rate of 0.9 μL/min before the next injection for a full run time of 69 min. In this targeted approach, MS2 fragmentation (quadrupole isolation width of 1.5 m/z, HCD collision energy of 30 %, AGC value of 1000 %, maximum injection time of 200 ms, and Orbitrap resolution setting of 60,000) was performed on 33 m/z values: 249.093 m/z, 276.1343 m/z, 284.0338 m/z, 284.0746 m/z, 293.1167 m/z, 298.1147 m/z, 361.0637 m/z, 365.1013 m/z, 375.2234 m/z, 384.0895 m/z, 384.0975 m/z, 391.0741 m/z, 401.1119 m/z, 401.1128 m/z, 408.1085 m/z, 424.1021 m/z, 442.1132 m/z, 530.2819 m/z, 578.2565 m/z, 578.2565 m/z, 580.2092 m/z, 587.1613 m/z, 589.2773 m/z, 595.2197 m/z, 597.1568 m/z, 602.1562 m/z, 603.1555 m/z, 604.1405 m/z, 605.1959 m/z, 606.1998 m/z, 606.2908 m/z, 607.1010 m/z, and 607.2926 m/z.
A 200-650 m/z full scan event (with a maximum injection time of 400 ms, an AGC value of 50 %, and an Orbitrap resolution setting of 15,000) was included to monitor for any anomalies in sample composition or irregularities in the analysis.
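Several of the targeted m/z values lie close together, so it can help to know which ones would fall inside the same 1.5 m/z isolation window when inspecting the PRM files. The Python sketch below flags such pairs. It assumes a symmetric, rectangular window centered on each target, which is a simplification of real quadrupole transmission, and the function name `co_isolated_pairs` is invented for this example.

```python
from itertools import combinations

# The 33 targeted precursor m/z values listed in the method description.
TARGETS = [
    249.093, 276.1343, 284.0338, 284.0746, 293.1167, 298.1147, 361.0637,
    365.1013, 375.2234, 384.0895, 384.0975, 391.0741, 401.1119, 401.1128,
    408.1085, 424.1021, 442.1132, 530.2819, 578.2565, 578.2565, 580.2092,
    587.1613, 589.2773, 595.2197, 597.1568, 602.1562, 603.1555, 604.1405,
    605.1959, 606.1998, 606.2908, 607.1010, 607.2926,
]
ISOLATION_WIDTH = 1.5  # m/z, quadrupole isolation width from the method

def co_isolated_pairs(targets, width=ISOLATION_WIDTH):
    """Pairs of distinct target m/z values closer than half the isolation width."""
    half = width / 2
    unique = sorted(set(targets))  # repeated entries collapse to one value
    return [(a, b) for a, b in combinations(unique, 2) if b - a < half]

pairs = co_isolated_pairs(TARGETS)
for a, b in pairs:
    print(f"{a:.4f} and {b:.4f} differ by {b - a:.4f} m/z")
```

For flagged pairs, the MS2 spectrum of one target may contain fragments from the other, so retention time and product ions are the more reliable way to tell them apart.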
Untargeted LC-MS2/MS3 Screening
Samples from wild-type and Xpc -/- liver DNA were prepared in triplicate as described above. The samples were reconstituted in 20 μL of H2O, then all three 20 μL aliquots of the wild-type samples were combined into one vial. The same was done for the three 20 μL aliquots of the Xpc -/- samples. The analysis was done using an Orbitrap Lumos instrument (ThermoFisher) coupled to a UHPLC system (UltiMate 3000 RSLCnano UHPLC, ThermoFisher) using positive NSI with the source temperature at 300°C and the spray voltage at 2200V. The UHPLC was equipped with a 5 μL loop and reversed phase chromatographic separation was performed using a nanoflow column (50 cm x 75 μm ID, New Objective, Woburn, MA) self-packed with Luna C18 (5 μm, 100 Å, Phenomenex). The mobile phases consisted of 5 mM NH4OAc (A) and 95 % CH3CN in H2O (B) and the injection volume was 4 μL. The gradient started at 1 % B for 20 min at a flow rate of 0.3 μL/min, followed by an increase to 5 % over 5 min, then an increase to 22 % over 35 min, followed by an increase to 95 % over 1 min and held at these conditions for 2 min. The gradient was then returned to 1 % B in 1 min and the column was re-equilibrated at this mobile phase composition for 3 min at a flow rate of 0.9 μL/min before the next injection for a full run time of 69 min. The samples were injected four separate times with each injection using a different mass range (range 1: 145-288 m/z, range 2: 283-426 m/z, range 3: 421-564 m/z, and range 4: 559-702 m/z) with a maximum injection time of 250 ms, an AGC value of 1250 %, and a resolution setting of 120,000. Data dependent parameters included a mass tolerance of ± 5 ppm, a repeat count of 1, a dynamic exclusion of 5 s, a minimum intensity of 5.0e3, and a cycle time of 2 s. MS2 fragmentation involved a quadrupole isolation width of 1.5 m/z, stepped HCD collision energy of 15, 30, and 45 %, an AGC value of 1000 %, a maximum injection time of 100 ms, and a resolution setting of 15,000.
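The four injection ranges overlap by 5 m/z at each boundary, so an ion near a boundary can appear in two files. A small Python sketch for deciding which range file or files to open for a given m/z, and for turning the ± 5 ppm tolerance into an m/z window, is shown below. The function names are illustrative only.

```python
# Mass ranges (m/z) of the four injections, as given in the method description.
MASS_RANGES = {1: (145, 288), 2: (283, 426), 3: (421, 564), 4: (559, 702)}

def ranges_covering(mz):
    """Return the range numbers whose window contains the given m/z."""
    return [n for n, (low, high) in MASS_RANGES.items() if low <= mz <= high]

def ppm_window(mz, tol_ppm=5.0):
    """Lower and upper m/z bounds for a +/- ppm tolerance around mz."""
    delta = mz * tol_ppm / 1e6
    return mz - delta, mz + delta

print(ranges_covering(606.1998))  # [4]
print(ranges_covering(285.0))     # [1, 2]
low, high = ppm_window(606.1998)
print(f"{low:.4f} to {high:.4f}")
```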
MS2 product ions were isolated in the ion trap with a 2 m/z isolation window, and MS3 fragmentation was triggered upon observation of the neutral loss of 2′-deoxyribose (-dR; 116.0474 Da), the base moieties (-G; 151.0494 Da, -A; 135.0545 Da, -T; 126.0429 Da; -C; 111.0433 Da), or base moieties plus water (-G+H2O; 169.0646 Da, -A+H2O; 153.0651 Da, -T+H2O; 144.0535 Da, -C+H2O; 129.0538 Da).
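The neutral-loss trigger can be reproduced offline when reviewing exported MS2 peak lists. The Python sketch below checks whether any product ion sits one of the listed neutral losses below the precursor. It assumes singly charged ions and a 5 ppm matching tolerance relative to the precursor m/z; the tolerance used by the instrument for this trigger is not stated in this README, so that value is an assumption. Only the deoxyribose and base losses are included, and the base-plus-water values from the text can be added to the dictionary in the same way.

```python
# Neutral losses (Da) that trigger MS3, taken from the text above.
NEUTRAL_LOSSES = {
    "dR": 116.0474,
    "G": 151.0494,
    "A": 135.0545,
    "T": 126.0429,
    "C": 111.0433,
}

def matching_losses(precursor_mz, product_mzs, tol_ppm=5.0):
    """List (label, product m/z) pairs whose precursor-product gap matches a loss.

    tol_ppm is an ASSUMED tolerance, applied relative to the precursor m/z.
    Ions are assumed to be singly charged.
    """
    tol = precursor_mz * tol_ppm / 1e6
    hits = []
    for product in product_mzs:
        gap = precursor_mz - product
        for label, loss in NEUTRAL_LOSSES.items():
            if abs(gap - loss) <= tol:
                hits.append((label, product))
    return hits

# HYPOTHETICAL MS2 peak list for a precursor at m/z 324.1302.
hits = matching_losses(324.1302, [208.0829, 152.0567, 117.0546])
print(hits)
```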
LC-MS2/MS3 Data Analysis using Compound Discoverer (CD)
The data generated from the untargeted analysis on the Orbitrap Lumos instrument were imported into CD (ThermoFisher), which provides analyte identification, characterization, and comparative analyses between sample groups. CD generated a list of all the potential compounds present in both the wild-type and Xpc -/- liver samples; the list consisted of a total of 65,108 potential compounds. Filters were then implemented based on the following criteria: peak area ratio of Xpc -/- over wild-type greater than 1.00, presence of guanine product ion in MS2 spectra (provided by the Compound Class node in CD), or neutral loss of either deoxyribose, guanine, cytosine, adenine, thymine, or any of those four base moieties plus water (provided by the Neutral Loss node in CD). With these filters implemented, the list of potential compounds decreased to 285. All 285 compounds were then manually confirmed using Xcalibur FreeStyle software (ThermoFisher). The manual confirmation resulted in 33 of the 285 showing a peak area ≥ 1.5 times higher in Xpc -/- than in the wild-type sample, a Gaussian-shaped peak, and similar retention times in both sample groups. The parameters used to generate the list of putative adducts are illustrated in Supplementary Figure 10 and listed in Supplementary Methods. Supplementary figures and methods are found in the associated journal article.
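The filtering logic described above can be expressed compactly for anyone re-screening an exported compound table. The Python sketch below is one possible reading, not the Compound Discoverer workflow itself: it assumes the area-ratio criterion is combined with the guanine-ion and neutral-loss criteria as "ratio AND (guanine ion OR neutral loss)", and it covers only the area-ratio part of the manual confirmation, since peak shape and retention time were judged by eye. The `Candidate` fields and example records are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    name: str
    area_xpc: float      # peak area in the Xpc -/- sample
    area_wt: float       # peak area in the wild-type sample
    guanine_ion: bool    # guanine product ion observed in MS2
    neutral_loss: bool   # deoxyribose or base (with or without water) loss observed

def area_ratio(c: Candidate) -> float:
    """Xpc -/- over wild-type peak-area ratio; infinite if absent in wild-type."""
    return c.area_xpc / c.area_wt if c.area_wt > 0 else float("inf")

def passes_automated_filter(c: Candidate) -> bool:
    """ASSUMED reading: enriched in Xpc -/- AND (guanine ion OR neutral loss)."""
    return area_ratio(c) > 1.00 and (c.guanine_ion or c.neutral_loss)

def passes_area_threshold(c: Candidate, min_ratio: float = 1.5) -> bool:
    """Area-ratio part of the manual confirmation (>= 1.5-fold in Xpc -/-)."""
    return area_ratio(c) >= min_ratio

# HYPOTHETICAL records, for illustration only.
candidates = [
    Candidate("cmpd_A", 3.0e6, 1.0e6, True, False),
    Candidate("cmpd_B", 1.2e6, 1.0e6, False, True),
    Candidate("cmpd_C", 5.0e6, 1.0e6, False, False),
]
shortlist = [c.name for c in candidates if passes_automated_filter(c)]
confirmed = [c.name for c in candidates
             if passes_automated_filter(c) and passes_area_threshold(c)]
print(shortlist, confirmed)
```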
