Polyamination with spermidine enhances pathogenic tau conformations while reducing filamentous aggregate formation in vitro
Data files
Jun 18, 2025 version; files total 6.82 GB
- 3R_MetaMorph_Output.xlsx (1.41 MB)
- 4R_MetaMorph_Output.xlsx (1.35 MB)
- README.md (59.88 KB)
- SPD_Modified_3R_Tau.raw (1.68 GB)
- SPD_Modified_4R_Tau.raw (1.72 GB)
- Unmodified_3R_Tau.raw (1.71 GB)
- Unmodified_4R_Tau.raw (1.71 GB)
Abstract
Tau is subject to a broad range of post-translational modifications (PTMs) that regulate its biological activity in health and disease, including microtubule (MT) dynamics, aggregation, and adoption of pathogenic conformations. The most studied PTMs of tau are phosphorylation and acetylation; however, the salience of other PTMs is not fully explored. Tissue transglutaminase (TG) is an enzyme whose activity is elevated in Alzheimer’s disease (AD). TG action on tau may lead to intramolecular and intermolecular cross-linking along with the incorporation of cationic polyamines [e.g., spermidine (SPD)] onto glutamine residues (Q). Even though SPD levels are significantly elevated in AD, the effects of SPD polyamination on tau biology have yet to be examined. In this work, we describe a method to produce recombinant SPD-modified tau where SPD modifications are mainly localized to Q residues within the N-terminus. MT binding and polymerization assays showed that SPD modification does not significantly alter tau’s binding to MTs but increases MT polymerization kinetics. In addition, biochemical and biophysical assays showed that SPD polyamination of tau markedly reduces tau polymerization into filamentous and β-sheet containing aggregates. On the other hand, SPD modification promotes the formation of pathogenic conformations (e.g., oligomerization and misfolding) by tau with or without inducing tau polymerization. Taken together, these data suggest that SPD polyamination of tau enhances its ability to polymerize microtubules and favors the adoption of pathogenic tau conformations but not filamentous aggregates in vitro.
https://doi.org/10.5061/dryad.59zw3r2jp
Description of the data and file structure
These data are mass spectrometry RAW data files and full data sets of MetaMorpheus analysis output for recombinant human tau proteins (hT40, 4R tau; hT39, 3R tau) expressed in E. coli that were either unmodified or modified in vitro by transglutaminase-mediated polyamination with spermidine. The recombinant proteins were purified using a series of chromatography approaches prior to and after polyamination reactions. The samples were digested with trypsin and rLysC and analyzed by mass spectrometry for spermidine modifications on Q residues.
Files and variables
File: SPD_Modified_3R_Tau.raw
Description: RAW data files for SPD-modified recombinant 3R tau protein
File: 4R_MetaMorph_Output.xlsx
Description: Full data set of protein groups, PSMs, quantified peptides, quantified peaks and sample codes from MetaMorpheus analysis output on 4R tau proteins. This file consists of individual tabs from the AllPSM, AllQuantifiedPeptides and AllQuantifiedPeaks file outputs from MetaMorpheus that combines the results from the individual sample output files (i.e. "230227-MA-4-calib" for unmodified 4R tau and "230227-MA-4-5-calib" for SPD modified 4R tau).
File: 3R_MetaMorph_Output.xlsx
Description: Full data set of protein groups, PSMs, quantified peptides, quantified peaks and sample codes from MetaMorpheus analysis output on 3R tau proteins. This file consists of individual tabs from the AllPSM, AllQuantifiedPeptides and AllQuantifiedPeaks file outputs from MetaMorpheus that combines the results from the individual sample output files (i.e. "230227-MA-3-calib" for unmodified 3R tau and "230227-MA-3-5-calib" for SPD modified 3R tau).
File: Unmodified_4R_Tau.raw
Description: RAW data files for unmodified recombinant 4R tau protein
File: SPD_Modified_4R_Tau.raw
Description: RAW data files for SPD-modified recombinant 4R tau protein
File: Unmodified_3R_Tau.raw
Description: RAW data files for unmodified recombinant 3R tau protein
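Because each workbook is organized into several tabs, it can help to list the tab names before loading any data. The following is a minimal sketch using only the Python standard library; it assumes the files are ordinary Office Open XML workbooks, and the helper name `list_sheet_names` is hypothetical (it is not part of MetaMorpheus or of this dataset).

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespace of the main workbook part in an .xlsx (Office Open XML) file.
MAIN_NS = "{http://schemas.openxmlformats.org/spreadsheetml/2006/main}"


def list_sheet_names(xlsx_file):
    """Return the tab names of an .xlsx workbook in workbook order.

    `xlsx_file` may be a file path or a binary file-like object.
    """
    with zipfile.ZipFile(xlsx_file) as archive:
        root = ET.fromstring(archive.read("xl/workbook.xml"))
    return [sheet.get("name") for sheet in root.iter(MAIN_NS + "sheet")]


# Example call (requires the file to have been downloaded first):
# print(list_sheet_names("4R_MetaMorph_Output.xlsx"))
```

The exact tab names are not listed in this README, so the output should be checked against the tab descriptions that follow.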
Sample Code Tabs:
| Column Variable | Units | Description |
|---|---|---|
| Sample_code | name | Identification code for each sample analyzed |
| Recombinant tau protein | name | Name of tau protein that matches the sample codes |
Protein Groups Tabs:
| Column Variable | Units | Description |
|---|---|---|
| Protein Accession | alphanumeric | The accession number of the protein as specified in the protein database. |
| Gene | name | The gene name associated with the identified peptide’s parent protein. |
| Organism | name | Name of the organism for the identified protein |
| Protein Full Name | name | The full name of the peptide’s parent protein. |
| Protein Unmodified Mass | Daltons | Molecular weight of the identified full protein without modifications |
| Number of Proteins in Group | count | The number of proteins in the protein group. Multiple proteins are associated with a peptide identification when parsimony cannot distinguish between the options. |
| Unique Peptides | amino acid letters | Peptides that are unique to the listed protein (they can only come from that one protein, based on the database in silico digestion). Currently, peptides that are unique to the group are not listed here; i.e., a protein group with >1 protein will always have 0 unique peptides because they are shared between all proteins in the group. |
| Shared Peptides | amino acid letters | Peptides that are shared between multiple proteins in the protein database(s) used for the search are listed. |
| Number of Peptides | count | Number of unique+shared peptides observed that match to the specified protein group. |
| Number of Unique Peptides | count | Number of unique peptides for the protein group. See Unique Peptides definition. |
| Sequence Coverage Fraction | fraction | The fraction of amino acids in the protein observed in any peptide spectral match with a Q value <0.01. |
| Sequence Coverage | amino acid letters | Displays amino acids in the protein observed in any peptide spectral match with a Q value <0.01 for each protein in the group, with the “|” character as the delimiter. Lowercase residues were not observed. Uppercase residues were observed. |
| Sequence Coverage with Mods | amino acid letters | Displays amino acids, including post-translational modifications, in the protein observed in any peptide spectral match with a Q value <0.01 for each protein in the group, with the “|” character as the delimiter. Lowercase residues were not observed. Uppercase residues were observed. |
| Fragment Sequence Coverage | amino acid letters | Amino acid sequence of the protein that can be matched with the fragment sequence. Lowercase residues were not observed. Uppercase residues were observed. |
| Modification Info List | description | List of modifications identified |
| Intensity_2601_Glyco_01-calib | arbitrary units | When simultaneously searching multiple raw files, MetaMorpheus outputs the quantified intensity of the given peak across all files with each file having its own column, named Intensity_”filename”. The column named “Intensity_X” corresponds to the peak intensity obtained from file X |
| Intensity_2601_Glyco_02-calib | arbitrary units | When simultaneously searching multiple raw files, MetaMorpheus outputs the quantified intensity of the given peak across all files with each file having its own column, named Intensity_”filename”. The column named “Intensity_X” corresponds to the peak intensity obtained from file X |
| Number of PSMs | count | The number of peptide spectral matches below with a Q-Value <0.01 observed for all peptides assigned to the protein group. |
| Protein Decoy/Contaminant/Target | categorical | Each peptide spectral match, unique peptide and protein is assigned as decoy (D)/contaminant (C)/or target (T). The preference in assignment is D>C>T. |
| Protein Cumulative Target | numeric | The protein group of all target proteins matching below the given Q-Value. |
| Protein Cumulative Decoy | numeric | The protein group of all decoy proteins matching below the given Q-Value. |
| Protein QValue | numeric | The possibility of getting a decoy protein from a given protein set. |
| Best Peptide Score | numeric | The MetaMorpheus Score of the peptide in the protein group with the highest scoring peptide spectrum match. |
| Best Peptide Notch QValue | numeric | The QValue Notch for the peptide in the protein group with the highest scoring peptide spectrum match. |
**Note:** Within the file, “N/A” indicates not available and was inserted for any cells with missing values in the MetaMorpheus output.
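The upper/lowercase convention of the Sequence Coverage column can be turned back into a coverage fraction. This is a minimal sketch of that relationship, assuming the cell contains only residue letters and the “|” delimiter (it is not meant for the Sequence Coverage with Mods column); the function name and the toy strings are hypothetical.

```python
def coverage_fractions(sequence_coverage):
    """Return the observed fraction for each protein in a Sequence Coverage cell.

    Uppercase letters are observed residues, lowercase letters are not
    observed, and "|" separates the proteins of a group.
    """
    fractions = []
    for protein in sequence_coverage.split("|"):
        residues = [c for c in protein if c.isalpha()]
        if not residues:
            continue  # skip empty segments
        observed = sum(1 for c in residues if c.isupper())
        fractions.append(observed / len(residues))
    return fractions


# Toy strings for illustration only (not real tau coverage).
print(coverage_fractions("MAEPrqef|mkAA"))
```

For a single-protein group the result should agree with the Sequence Coverage Fraction column, up to rounding.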
PSMs Tabs:
| Column Variable | Units | Description |
|---|---|---|
| File Name | name | The filename and path that contained the scan used in the identification. |
| Scan Number | numeric | The scan number is specified in the header of each scan. The scan number reported usually contains the MS2 data used in the peptide spectral match. It is possible for multiple co-isolated peptides to be matched to the same scan number. |
| Scan Retention Time | minutes | The experimental time that the scan was acquired. |
| Num Experimental Peaks | count | The number of experimental peaks (post-peak trimming) in the MS2 scan. |
| Total Ion Current | numeric | The total ion current of the MS2 spectrum. This is the sum of intensities from every MS2 peak. These intensities can come from fragmentation of multiple precursors depending on the selectivity for fragmentation (aka isolation width) and crowdedness of the MS1 spectrum. |
| Precursor Scan Number | numeric | The scan number of the most recent MS1 scan. |
| Precursor Charge | numeric | The charge of the isolated precursor peptide. |
| Precursor MZ | numeric (ratio) | The mass to charge of the isolated precursor peptide. This is not necessarily the selected MZ for isolation. |
| Precursor Mass | Daltons | The neutral (uncharged) mass of the peptide. |
| Score | numeric | MetaMorpheus score is incremented by one for each matching b- and y-ion. The number after the decimal is the fraction of total peak intensity from the MS2 scan that can be assigned to the particular peptide spectral match. |
| Delta Score | numeric | The MetaMorpheus score difference between the reported peptide and the next highest scoring peptide. If the next highest scoring peptide has the same score, both peptides are reported in the same row (ambiguity) and the next highest scoring peptide is used for the delta score. Thus, a delta score of 0 is not possible. |
| Notch | numeric | A narrow mass window in which the value is an allowed mass difference between the experimentally observed peptide and the best matching theoretical peptide. This is an arbitrary number that signifies the notch’s category. |
| Base Sequence | amino acid letters | The peptide amino acid sequence without modifications |
| Full Sequence | amino acid letters | The complete peptide sequence containing all variable and localized modifications. |
| Essential Sequence | amino acid letters | The full sequence containing only database-defined modifications and absent of fixed/variable modifications. |
| Ambiguity Level | numeric level (1-5) | Ambiguity level as defined by Smith et al. (PMID: 31451767; PMCID: PMC6857706; DOI: 10.1038/s41592-019-0573-x). This classification was originally designed for proteoform identifications, but is equally effective at communicating ambiguity in peptide identifications. |
| PSM Count (unambiguous, <0.01 q-value) | count | The number of peptide spectral matches below with a Q-Value <0.01 observed for all peptides assigned to the protein group. |
| Mods | name | The name(s) of the modification(s) on the peptide. |
| Mods Chemical Formulas | letters | The chemical formula(s) of the identified modification(s). |
| Mods Combined Chemical Formula | letters | The aggregated chemical formula of all identified modification. |
| Num Variable Mods | count | The number of variable modifications matched to a peptide. |
| Missed Cleavages | count | The number of missed enzyme cleavages |
| Peptide Monoisotopic Mass | Daltons | The mass of the peptide with the most abundant isotopes |
| Mass Diff (Da) | Daltons | The absolute mass difference between the observed and theoretical precursor mass. (Calculated as observed-theoretical). |
| Mass Diff (ppm) | ppm | The ppm mass difference between the observed and theoretical precursor mass. (Calculated as observed-theoretical). |
| Protein Accession | alphanumeric | The accession number of the protein as specified in the protein database. |
| Protein Name | name | The full name of the peptide’s parent protein. |
| Gene Name | name | The gene name associated with the identified peptide’s parent protein. |
| Organism Name | name | The database specified organism that the peptide’s parent protein originated from. |
| Identified Sequence Variations | N/A | If the search was conducted using a database containing annotated sequence variants, this column displays the sequence variant that is identified by the PSM or peptide. |
| Splice Sites | N/A | If the search was conducted using a database that contains annotated splice sites, this column contains splice sites which the PSM or peptide crossed. |
| Contaminant | categorical | Specifies if the peptide’s parent protein is from a contaminant database, “Y”, or not “N” |
| Decoy | categorical | Specifies if the peptide is a decoy peptide “Y”, or not “N”. |
| Peptide Description | description | A brief statement regarding the peptide’s digestion. |
| Start and End Residues In Protein | range | The one-based amino acid positions of the peptide in the parent protein(s). |
| Previous Amino Acid | amino acid letters | The amino acid in the protein preceding the specified peptide. |
| Next Amino Acid | amino acid letters | Amino acid in the protein that is next in line on the C-terminal end. |
| Theoreticals Searched | count | The number of theoretical peptides searched against the spectrum. This is only reported if e-Value calculations are specified in the search task. |
| Decoy/Contaminant/Target | categorical | Each peptide spectral match, unique peptide and protein is assigned as decoy (D)/contaminant (C)/or target (T). The preference in assignment is D>C>T. |
| Matched Ion Series | alphanumeric | The found product ions and their respective charges. |
| Matched Ion Mass-To-Charge Ratios | alphanumeric | The theoretical m/zs that were matched to the observed spectrum. |
| Matched Ion Mass Diff (Da) | Daltons | The absolute mass differences between the observed and theoretical product ion masses. (Calculated as observed-theoretical). Order can be found in Matched Ion Series |
| Matched Ion Mass Diff (Ppm) | ppm | The ppm mass differences between the observed and theoretical product ion masses. (Calculated as observed-theoretical). Order can be found in Matched Ion Series |
| Matched Ion Intensities | alphanumeric | The observed intensities for the matched product ions. |
| Matched Ion Counts | numeric | The number of product ions found for each series. |
| Normalized Spectral Angle | numeric | Normalized Spectral Angle is used to measure the similarity between two spectra. Two identical spectra will have a spectral angle of 1, whereas two completely different spectra will have a spectral angle of 0. |
| Localized Scores | N/A | If there is no ambiguity (only one peptide was assigned), then there is an attempt to localize the mass difference between the experimental and theoretical precursor masses. This mass difference is “placed” on each possible amino acid, and the resulting peptide score is calculated and reported in this column. Each reported score represents an amino acid (N-to-C) that the mass difference was “localized” to. |
| Improvement Possible | N/A | The increase in the MetaMorpheus score produced by localization of the modification to the position specified in the Full Sequence. |
| Cumulative Target | numeric rank | The target/decoy approach for determination of FDR yields lists of peptides and proteins matching either target or decoy. These commingled lists are sorted by score. The top scoring target match is labeled as 1. Each additional match to target is incremented by one. The total count of target matches scoring at or above a particular score at any point in the list is reported as Cumulative Target. Cumulative Decoy divided by Cumulative Target is the FDR. |
| Cumulative Decoy | numeric rank | The target/decoy approach for determination of FDR yields lists of peptides and proteins matching either target or decoy. These commingled lists are sorted by score. The top scoring decoy match is labeled as 1. Each additional match to decoy is incremented by one. The total count of decoy matches scoring at or above a particular score at any point in the list is reported as Cumulative Decoy. Cumulative Decoy divided by Cumulative Target is the FDR. |
| QValue | numeric | The q-value for the identification, calculated as the number of cumulative decoys (false positives) divided by the number of cumulative targets (true positives). |
| Cumulative Target Notch | count | The cumulative number of targets specific to the specified notch |
| Cumulative Decoy Notch | count | The cumulative number of decoys specific to the specified notch. |
| QValue Notch | numeric | The notch specific q-value for the identification, calculated as the number of cumulative notch decoys (false positives) divided by the number of cumulative notch targets (true positives). |
| PEP | numeric | Posterior Error Probabilities are calculated by a percolator-like gradient-boosted binary decision tree (PMID: 33683901; PMCID: PMC8377504; DOI: 10.1021/acs.jproteome.0c00838). This value represents the probability that a given spectral match is incorrect. |
| PEP_QValue | numeric | A traditional Q-Value calculation with results ranked by ascending PEP, as opposed to the Q-Value column which is ranked by descending MetaMorpheus score. This value represents the probability that a given result is wrong, in the set containing all matches with a PEP value less than or equal to the given result. |
Note: Within the file, “N/A” indicates not available and was inserted for any cells with missing values in the MetaMorpheus output.
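The relationship between the Cumulative Target, Cumulative Decoy and QValue columns described in the table can be sketched as follows. This follows the column descriptions literally (decoys divided by targets at or above each score) and is only an illustration; MetaMorpheus itself may handle ties, notches or other details differently. The function name, the input layout and the toy scores are assumptions.

```python
def q_values(psms):
    """Compute cumulative target/decoy counts and q-values.

    `psms` is an iterable of (score, is_decoy) pairs. Rows are returned
    sorted by descending score as
    (score, is_decoy, cumulative_target, cumulative_decoy, q_value).
    """
    ranked = sorted(psms, key=lambda p: p[0], reverse=True)
    cum_target = 0
    cum_decoy = 0
    rows = []
    for score, is_decoy in ranked:
        if is_decoy:
            cum_decoy += 1
        else:
            cum_target += 1
        # Assumption: report 1.0 while no target has been seen yet.
        q = cum_decoy / cum_target if cum_target else 1.0
        rows.append((score, is_decoy, cum_target, cum_decoy, q))
    return rows


# Toy input: three target matches and one decoy match.
for row in q_values([(20.5, False), (12.0, False), (15.1, True), (18.2, False)]):
    print(row)
```

Filtering the rows to `q_value < 0.01` corresponds to the 1% threshold referred to throughout these tables.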
Quantified Peptides Tabs:
| Column Variable | Units | Description |
|---|---|---|
| Sequence | amino acid letters | Individual peptide sequences |
| Base Sequence | amino acid letters | The peptide amino acid sequence without modifications |
| Protein Groups | name | The name of the protein groups of identified peptides |
| Gene Names | name | The gene name associated with the identified peptide’s parent protein. |
| Organism | name | The database specified organism that the peptide’s parent protein originated from. |
| Intensity_2601_Glyco_01-calib | arbitrary units | When simultaneously searching multiple raw files, MetaMorpheus outputs the quantified intensity of the given peak across all files with each file having its own column, named Intensity_”filename”. The column named “Intensity_X” corresponds to the peak intensity obtained from file X. |
| Intensity_2601_Glyco_02-calib | arbitrary units | When simultaneously searching multiple raw files, MetaMorpheus outputs the quantified intensity of the given peak across all files with each file having its own column, named Intensity_”filename”. The column named “Intensity_X” corresponds to the peak intensity obtained from file X. |
| Detection Type_2601_Glyco_01-calib | name | When simultaneously searching multiple raw files, MetaMorpheus outputs the detection type of the given peak across all files with each file having its own column, named Detection_”filename”. The column named “Detection_X” corresponds to the detection method for the peak obtained from file X. |
| Detection Type_2601_Glyco_02-calib | name | When simultaneously searching multiple raw files, MetaMorpheus outputs the detection type of the given peak across all files with each file having its own column, named Detection_”filename”. The column named “Detection_X” corresponds to the detection method for the peak obtained from file X. |
Note: Within the file, “N/A” indicates not available and was inserted for any cells with missing values in the MetaMorpheus output.
Quantified Peaks Tabs:
| Column Variable | Units | Description |
|---|---|---|
| File Name | N/A | The filename and path that contained the scan used in the identification. |
| Base Sequence | amino acid letters | Unmodified amino acid sequence of identified peptide |
| Full Sequence | amino acid letters | The complete peptide sequence containing all variable and localized modifications. |
| Protein Group | name | The name of the protein groups of identified peptides |
| Peptide Monoisotopic Mass | Daltons | The mass of the peptide calculated from atoms in their most abundant isotopic form (12C, 16O, 14N, etc.). This is the uncharged (neutral) mass. |
| MS2 Retention Time | minutes | The retention time at which the MS2 scan for a given peak was initiated/obtained. |
| Precursor Charge | charge value | The charge of the isolated precursor peptide. |
| Theoretical MZ | number | Mass obtained from the precursor monoisotopic mass divided by its deconvoluted charge. |
| Peak intensity | numeric | Intensity of the MS1 peak. The sum of intensity from each peak of the apex isotopic envelope. |
| Peak RT Start | minutes | The retention time at which a peak is first detected. |
| Peak RT Apex | minutes | The retention time at which a peak is most intense. |
| Peak RT End | minutes | The retention time at which a peak is last detected. |
| Peak MZ | numeric | The measured MS1 m/z for the peak. |
| Peak Charge | numeric (charge) | The measured charge at the peak. |
| Num Charge States Observed | count | The number of unique charge states a precursor peptide was found to exist as. Only MS1 evidence is required for an observation. |
| Peak Detection Type | name | Describes how a peak was detected for quantification. Typically, MSMS and MBR (match between runs). |
| MBR Score | N/A | The measurement used to evaluate an MBR identification, calculated as the geometric mean of four factors (e.g., the distribution similarity between the anchor and donor peptide in retention time, mass error, matched scan and intensity). Scores range from 1 to 100; higher scores are better. |
| PSMs Mapped | numeric | Number of MS2 PSMs that mapped to the precursor peptide. This correlates to the number of peptide fragmentation events seen for the given precursor. |
| Base Sequences Mapped | numeric | The base sequence of the protein that was matched with the fragments in the experiment. |
| Full Sequences Mapped | numeric | The full sequence of the protein that was matched with the fragments in the experiment. |
| Peak Split Valley RT | numeric | The retention time of the valley separating the current peak from the next closest peak. |
| Peak Apex Mass Error (ppm) | ppm | The difference between the theoretical mass and the precursor monoisotopic mass. |
Note: Within the file, “N/A” indicates not available and was inserted for any cells with missing values in the MetaMorpheus output.
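To relate the Peptide Monoisotopic Mass, Precursor Charge, Theoretical MZ and ppm error columns, the usual positive-ion conventions can be sketched as below. This assumes protonated ions, so one proton mass is added per charge before dividing by the charge; the function names are hypothetical and the numbers in the example are arbitrary.

```python
PROTON_MASS = 1.007276466  # Da, mass of a proton


def theoretical_mz(neutral_mass, charge):
    """m/z of a protonated ion from its neutral monoisotopic mass."""
    return (neutral_mass + charge * PROTON_MASS) / charge


def ppm_error(observed, theoretical):
    """Signed mass error in parts per million (observed - theoretical)."""
    return (observed - theoretical) / theoretical * 1e6


# Arbitrary example values, not taken from the dataset.
mz = theoretical_mz(1000.0, 2)
print(mz, ppm_error(1000.01, 1000.0))
```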
Code/software
RAW data files were analyzed with the MetaMorpheus software version 1.0.1 developed by the Smith laboratory (Miller et al., 2023). For hT40 proteins, the following databases were downloaded from UniProt (November 2021) and used for analysis: Escherichia coli (strain K12) (UP000000625), trypsin (Q29463), Lys-C (Q02SZ7), and full-length tau sequence (2N4R isoform, P10636-8). The same files were used to analyze the hT39 proteins, using the 2N3R tau isoform sequence (P10636-5) instead of the 2N4R sequence. Mass shifts corresponding to the non-acetylated SPD were used to search for modifications: +128.1313485 for SPD (Schopfer et al., 2024; Yu et al., 2015). In addition, the fragmentation pattern of SPD was determined by running SPD alone on MS. Mass-to-charge ratios (m/z) corresponding to diagnostic ions (DIs) were identified: 54.048, 57.059, 71.075, 111.109, 128.132. The search parameters for the SPD modification included both mass shift and the identified diagnostic ions.
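The +128.1313485 Da mass shift can be reproduced from elemental compositions. The sketch below assumes the standard transglutaminase transamidation chemistry, in which spermidine (C7H19N3) is attached to the glutamine side chain with loss of ammonia (NH3), giving a net addition of C7H16N2. The variable and function names are illustrative only.

```python
# Monoisotopic masses (Da) of the most abundant isotopes.
MONO = {"C": 12.0, "H": 1.00782503207, "N": 14.0030740048}


def mono_mass(formula):
    """Monoisotopic mass of a composition given as {element: count}."""
    return sum(MONO[element] * count for element, count in formula.items())


SPERMIDINE = {"C": 7, "H": 19, "N": 3}
AMMONIA = {"N": 1, "H": 3}  # released during transamidation (assumption)

# Net mass added to a glutamine residue by spermidine polyamination.
spd_shift = mono_mass(SPERMIDINE) - mono_mass(AMMONIA)
print(f"{spd_shift:.7f}")
```

The computed value agrees with the search mass shift quoted above to within 1e-6 Da.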
The analysis sequence included mass calibration, global post-translational modification discovery (G-PTM-D) (Li et al., 2017), and a classic search. Mass calibration was conducted using the following criteria: protease = trypsin; maximum missed cleavages = 2; minimum peptide length = 7; maximum peptide length = unspecified; initiator methionine behavior = variable; variable modifications = Oxidation on M; max mods per peptide = 2; max modification isoforms = 1024; precursor mass tolerance = ±15.0000 ppm; product mass tolerance = ±25.0000 ppm. The criteria utilized for G-PTM-D were protease = trypsin; maximum missed cleavages = 2; minimum peptide length = 7; maximum peptide length = unspecified; initiator methionine behavior = variable; max modification isoforms = 1024; variable modifications = Oxidation on M; G-PTM-D modifications count = 3; precursor mass tolerance(s) = ±5.0000 ppm around 0, 128.131348525 Da; product mass tolerance = ±20.0000 ppm. Finally, a classic search was conducted using the following criteria: protease = trypsin; search for truncated proteins and proteolysis products = false; maximum missed cleavages = 2; minimum peptide length = 7; maximum peptide length = unspecified; initiator methionine behavior = variable; variable modifications = Oxidation on M; max mods per peptide = 2; max modification isoforms = 1024; precursor mass tolerance = ±5.0000 ppm; product mass tolerance = ±20.0000 ppm; report peptide spectral match (PSM) ambiguity = true. SPD polyamination sites of tau detected at a false discovery rate of 1% are reported (Supplementary table S1). Supplementary table S2 lists all quantified tau peptides in unmodified vs SPD-modified tau samples. Supplementary table S3 shows the quantified peaks of tau with their corresponding peptide masses, theoretical and observed m/z, retention time, and PSMs.
MetaDraw version 1.0.5 was utilized to review the PSMs of modified and unmodified tau peptides (samples of these peptides are included in Figures S2 and S3). Full proteomics data sets and .RAW files from mass spectrometry are available here.
Access information
Other publicly accessible locations of the data:
- N/A
Data was derived from the following sources:
- N/A
Preparation of recombinant unmodified and SPD-modified tau proteins
Recombinant hT40 and hT39 tau proteins were prepared from a 4 L bacterial culture as described previously (Combs et al., 2017), with the exception that BL21 bacteria (NEB, #C2527H) were used. The concentration of recombinant tau protein was determined using the BCA method (Thermo, # A53225). Next, polyamination reactions were performed in vitro by adapting the protocol described by Song and coworkers (Song et al., 2013). Briefly, 8 mM of SPD (Sigma, # S2626-1G) and 0.2 µM of TG enzyme (Sigma, # TS398) were added to 0.93 mg/ml of tau in 50 µl of reaction buffer (50 mM Tris HCl, pH 8, 10 mM calcium chloride and 5 mM DTT) followed by a 1-hour incubation at 37 °C. Unmodified tau proteins were subjected to the same reaction conditions, but TG was excluded. Then, the TG enzyme was inactivated by heating at 70 °C for 2 min. The TG-catalyzed reaction produces SPD-modified monomeric proteins as well as intra- and inter-molecular crosslinked proteins. To remove inter-molecular crosslinked tau (i.e. high molecular weight species), SPD-modified tau was further purified by passing the sample through a 100 kDa MWCO Amicon filter (0.5 ml; Millipore, # UFC510096) (Figure S1). Unmodified tau proteins were subjected to the same process. The flow-through (FL) was collected for further purification. Each concentrator membrane was washed 6 times with 400 µl buffer A (500 mM NaCl, 10 mM Tris, and 5 mM Imidazole pH 8). All the FL samples were pooled and subsequently buffer exchanged into buffer A using a HiPrep 26/10 desalting column (Cytiva, #17508701) and fast protein liquid chromatography (FPLC). The desalting column was equilibrated with 5 column volumes (CVs) of buffer A, then protein samples were run over the column in 5 CVs of buffer A at a flow rate of 5 ml/min, and fractions containing tau were collected. Finally, to purify the tau proteins (6x His tagged) from other proteins in the polyamination reaction (e.g. TG) and free SPD, heavy metal affinity chromatography was performed using a 5 ml HiTrap Talon crude column (Cytiva, #28953767). The HiTrap Talon column was equilibrated in 5 CVs of buffer A, the proteins were run over the column, the column was washed with 5 CVs of buffer A, and then tau proteins were eluted in 10 CVs of buffer B (100 mM imidazole in buffer A, pH 8, supplemented with 200 µM PMSF) at a flow rate of 3 ml/min in 5 ml fractions. Fractions containing highly purified monomeric tau (determined using SDS-PAGE) were pooled and concentrated to 2-4 mg/ml using an Amicon® Ultra Centrifugal Filter, 30 kDa MWCO (Millipore, # UFC903008), then DTT was added (final concentration of 1 mM). The unmodified hT40 (hT40), SPD-hT40, unmodified hT39 (hT39), and SPD-hT39 samples were then aliquoted and frozen at -80 °C. The final concentration of recombinant tau proteins was determined using the SDS-Lowry method as described previously (Combs et al., 2017).
Preparation of recombinant tau protein for tandem mass spectrometry
The hT40, SPD-hT40, hT39, and SPD-hT39 proteins were digested using a combination of trypsin (Promega, #V5280) and rLysC (Promega, #V167A). First, each protein (3 µg) was subjected to 5 rounds of buffer exchange with 25 mM ammonium bicarbonate (AmBic) pH 8 using a 0.5 ml 3K MWCO Amicon centrifugal filter (15,000 x g for 10 min; Millipore, # UFC500396). Then, the tau proteins were retrieved by inverting the filter into a recovery tube and centrifuging at 15,000 x g for 2 min, and were then vacuum dried using a Vacufuge. The dried pellets were reconstituted in 50 µl of digestion buffer [12.5 mM AmBic, pH 8 + 50% acetonitrile (ACN)] and incubated at 37 °C for 90 min with rLysC (150 ng of enzyme per 3 μg of recombinant protein). Then, trypsin was added (300 ng of enzyme per 3 μg of recombinant protein) and incubated at 37 °C for 16-18 hours. The following day, digested protein samples were vacuum dried and stored at -20 °C until analysis by mass spectrometry (MS).
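The enzyme amounts above correspond to enzyme-to-protein ratios of 1:20 (w/w) for rLysC and 1:10 (w/w) for trypsin. The following sketch scales those amounts to a different protein input; linear scaling is an assumption, since only the 3 µg case is described here, and the function name is hypothetical.

```python
# Enzyme amounts per microgram of protein, derived from the protocol
# (150 ng rLysC and 300 ng trypsin per 3 µg of recombinant protein).
RLYSC_NG_PER_UG = 150 / 3    # 50 ng per µg, i.e. 1:20 w/w
TRYPSIN_NG_PER_UG = 300 / 3  # 100 ng per µg, i.e. 1:10 w/w


def digestion_enzymes_ng(protein_ug):
    """Return the enzyme amounts (ng) for a given protein input (µg)."""
    return {
        "rLysC": protein_ug * RLYSC_NG_PER_UG,
        "trypsin": protein_ug * TRYPSIN_NG_PER_UG,
    }


print(digestion_enzymes_ng(3))
```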
Tandem MS of recombinant tau proteins
We used a Thermo Scientific Ultimate 3000 RSLCnano System coupled with nanoscale liquid chromatography. Desalting of digested peptides was conducted in-line using a 3 μm diameter bead, C18 Acclaim PepMap trap column (75 μm × 20 mm) with 2% ACN, 0.1% formic acid (FA) for 5 min at a flow rate of 2 μl/min at 40 °C. The trap column was then brought in line with a 2 μm diameter bead, C18 EASY-Spray column (75 μm × 250 mm) for analytical separation over 128 min with a flow rate of 350 nl/min at 40 °C. The mobile phase included two buffers: 0.1% FA (Buffer A) and 0.1% FA in ACN (Buffer B), and a gradient was used for separation as follows: 12.5 min desalting, 95 min 4–40% B, 2 min 40–65% B, 3 min 65–95% B, 11 min 95% B, 1 min 95–4% B, 3 min 4% B. We injected 1 μg of each sample for analysis. Top 20 data-dependent mass spectrometric analysis was performed with a Q Exactive HF-X Hybrid Quadrupole-Orbitrap Mass Spectrometer. MS1 resolution was 60K at 200 m/z with a maximum injection time of 45 ms, AGC target of 3e6, and scan range of 300–1500 m/z. MS2 resolution was 30K at 200 m/z, with a maximum injection time of 54 ms, AGC target of 1e5, and isolation range of 1.3 m/z. High-energy collision dissociation (HCD) normalized collision energy was 28. Only ions with charge states from +2 to +6 were selected for fragmentation, and dynamic exclusion was set to 30 s. The electrospray voltage was 1.9 kV at a 2.0 mm tip-to-inlet distance. The ion capillary temperature was 280 °C and the RF level was 55.0. All other parameters were set as default.
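As a quick consistency check, the gradient steps listed above can be laid out as a schedule with start and end times. This is a minimal sketch; the step labels are copied from the text, while the data layout and function name are hypothetical. The listed steps sum to 127.5 min, which presumably corresponds to the stated 128 min after rounding.

```python
# (label, duration in minutes), in the order given in the method text.
GRADIENT = [
    ("desalting", 12.5),
    ("4-40% B", 95.0),
    ("40-65% B", 2.0),
    ("65-95% B", 3.0),
    ("95% B", 11.0),
    ("95-4% B", 1.0),
    ("4% B", 3.0),
]


def schedule(steps):
    """Return (label, start_min, end_min) for each consecutive step."""
    start = 0.0
    rows = []
    for label, duration in steps:
        rows.append((label, start, start + duration))
        start += duration
    return rows


for label, start, end in schedule(GRADIENT):
    print(f"{label:10s} {start:6.1f} to {end:6.1f} min")
```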
MS data analysis to determine SPD modification sites
RAW data files were analyzed with the MetaMorpheus software version 1.0.1 developed by the Smith laboratory (Miller et al., 2023). For hT40 proteins, the following databases were downloaded from UniProt (November 2021) and used for analysis: Escherichia coli (strain K12) (UP000000625), trypsin (Q29463), Lys-C (Q02SZ7), and full-length tau sequence (2N4R isoform, P10636-8). The same files were used to analyze the hT39 proteins, using the 2N3R tau isoform sequence (P10636-5) instead of the 2N4R sequence. Mass shifts corresponding to the non-acetylated SPD were used to search for modifications: +128.1313485 for SPD (Schopfer et al., 2024; Yu et al., 2015). In addition, the fragmentation pattern of SPD was determined by running SPD alone on MS. Mass-to-charge ratios (m/z) corresponding to diagnostic ions (DIs) were identified: 54.048, 57.059, 71.075, 111.109, 128.132. The search parameters for the SPD modification included both mass shift and the identified diagnostic ions.
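The diagnostic-ion criterion can be illustrated with a simple peak-matching check. This is only a sketch of the idea and not the MetaMorpheus implementation: the absolute tolerance of 0.005 Da is an assumption chosen because the diagnostic m/z values are reported to three decimals, and the function name and example peaks are hypothetical.

```python
# Diagnostic ion m/z values for SPD, as listed in the text above.
DIAGNOSTIC_IONS = (54.048, 57.059, 71.075, 111.109, 128.132)


def matched_diagnostic_ions(peak_mzs, tol_da=0.005):
    """Return the diagnostic ions found among the observed MS2 peak m/z values.

    A diagnostic ion counts as matched if any observed peak lies within
    `tol_da` (an assumed absolute tolerance) of its m/z.
    """
    matched = []
    for ion in DIAGNOSTIC_IONS:
        if any(abs(mz - ion) <= tol_da for mz in peak_mzs):
            matched.append(ion)
    return matched


# Made-up peak list for illustration only.
print(matched_diagnostic_ions([54.0481, 71.0752, 200.1]))
```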
The analysis sequence included mass calibration, global post-translational modification discovery (G-PTM-D) (Li et al., 2017), and a classic search. Mass calibration was conducted using the following criteria: protease = trypsin; maximum missed cleavages = 2; minimum peptide length = 7; maximum peptide length = unspecified; initiator methionine behavior = variable; variable modifications = Oxidation on M; max mods per peptide = 2; max modification isoforms = 1024; precursor mass tolerance = ±15.0000 ppm; product mass tolerance = ±25.0000 ppm. The criteria utilized for G-PTM-D were protease = trypsin; maximum missed cleavages = 2; minimum peptide length = 7; maximum peptide length = unspecified; initiator methionine behavior = variable; max modification isoforms = 1024; variable modifications = Oxidation on M; G-PTM-D modifications count = 3; precursor mass tolerance(s) = ±5.0000 ppm around 0, 128.131348525 Da; product mass tolerance = ±20.0000 ppm. Finally, a classic search was conducted using the following criteria: protease = trypsin; search for truncated proteins and proteolysis products = false; maximum missed cleavages = 2; minimum peptide length = 7; maximum peptide length = unspecified; initiator methionine behavior = variable; variable modifications = Oxidation on M; max mods per peptide = 2; max modification isoforms = 1024; precursor mass tolerance = ±5.0000 ppm; product mass tolerance = ±20.0000 ppm; report peptide spectral match (PSM) ambiguity = true. SPD polyamination sites of tau detected at a false discovery rate of 1% are reported (Supplementary table S1). Supplementary table S2 lists all quantified tau peptides in unmodified vs SPD-modified tau samples. Supplementary table S3 shows the quantified peaks of tau with their corresponding peptide masses, theoretical and observed m/z, retention time, and PSMs.
MetaDraw version 1.0.5 was utilized to review the PSMs of modified and unmodified tau peptides (samples of these peptides are included in Figures S2 and S3). The .RAW MS files for each protein analysis are located here.
