Data from: BioID2-based tau interactome reveals novel and known protein interactions associated with multiple cellular pathways
Data files
Sep 24, 2025 version files 10.94 GB
- BioID2-HA-Replicate_1.raw (588.05 MB)
- BioID2-HA-Replicate_2.raw (600.60 MB)
- BioID2-HA-Replicate_3.raw (599.81 MB)
- BioID2-HA.mzid (7.35 MB)
- BioID2-HA.mzML (327.02 MB)
- BioID2-Tau-Replicate_1.raw (608.85 MB)
- BioID2-Tau-Replicate_2.raw (586.79 MB)
- BioID2-Tau-Replicate_3.raw (591.11 MB)
- BioID2-Tau.mzid (10.57 MB)
- BioID2-Tau.mzML (334.90 MB)
- LFQ_Analysis.mzid (71.12 MB)
- LFQ_Analysis.mzML (2.32 GB)
- LFQ_Analysis.mzTab (20.68 MB)
- MYC-BioID2-Replicate_1.raw (587.63 MB)
- MYC-BioID2-Replicate_2.raw (594.78 MB)
- MYC-BioID2-Replicate_3.raw (603.45 MB)
- MYC-BioID2.mzid (11.67 MB)
- MYC-BioID2.mzML (337.95 MB)
- README.md (4.89 KB)
- Tau-BioID2-Replicate_1.raw (595.38 MB)
- Tau-BioID2-Replicate_2.raw (597.04 MB)
- Tau-BioID2-Replicate_3.raw (596.63 MB)
- Tau-BioID2.mzid (11.41 MB)
- Tau-BioID2.mzML (336.68 MB)
Abstract
Pathological inclusions composed of tau protein are hallmarks of neurodegenerative diseases collectively known as tauopathies, of which the most common is Alzheimer’s disease. Tau is known as a microtubule-associated protein involved in regulating microtubule dynamics, but accumulating evidence suggests tau is also involved in a multitude of biological functions regulated, in part, by direct and/or transient protein interactions. Deciphering the tau protein interactome is critical for better understanding the physiological and pathological roles of tau. This work aimed to identify potential tau-interacting partners using the in situ protein labeling biotin identification (BioID2) method. Fusion proteins were created between full-length human tau and BioID2 on the N-terminus (BioID2-Tau) or C-terminus (Tau-BioID2) of tau. Advantages of this approach include in-cell interactor labeling and enhanced likelihood of detecting transient and/or weak interactions. We report a total of 324 potential tau interactors by combining the list of proteins identified with BioID2-Tau and Tau-BioID2. This list included proteins found in the cytoskeleton, mitochondria, synapses, nucleus, and ribonucleoprotein complex. Gene ontology molecular function analysis identified RNA binding, translation regulation, ubiquitin ligase activity, kinase binding, and mitochondrial oxidoreductase. We validated the interaction between tau and selected candidates using two independent approaches: the proximity ligation assay and co-immunoprecipitation. We validated novel and known tau interactions with cytoskeletal proteins (MAP2 and MAP6), proteins associated with the nucleus (FUS and prune1), and proteins associated with the synapses (synapsin-1 and neurabin-2). Importantly, the *in situ* labeling approach revealed potential interactors that were not clearly identified by traditional approaches such as co-immunoprecipitation. Thus, this approach is a powerful tool to identify potential members of the tau interactome via in situ labeling. This work helps expand our understanding of tau’s potential functional roles, which may also advance our understanding of its role in neurodegenerative diseases.
https://doi.org/10.5061/dryad.280gb5mxj
The files contained here are the mass spectrometry (MS)-derived file sets used to generate the data reported in the manuscript. This study used the BioID2 (a biotin ligase proximity labeling) approach to identify proteins interacting with tau protein. Tau knockout primary neurons were transduced with lentiviruses to express two sets of proteins (one control construct and one related tau construct). The first set of constructs was a Myc-BioID2 alone control construct and the related BioID2-Tau fusion construct (BioID2 fused to the N-terminus of tau protein). The second set of constructs was a BioID2-HA alone control construct and the related Tau-BioID2 fusion construct (BioID2 fused to the C-terminus of tau protein). Each construct condition included three independent experimental replicate samples. Biotin-labeled proteins were extracted from cell lysates and analyzed using mass spectrometry. RAW files were obtained using MS as described in the methods. The files include the MS/MS peak lists (.mzML), the peptide/protein ID files (.mzid), the label-free quantitation (LFQ) data (.mzTab), and the RAW MS files for each of the experimental replicates (.raw). Protein ID and LFQ results were obtained by comparing each of the BioID2 tau fusion proteins to their respective controls (Tau-BioID2 versus BioID2-HA or BioID2-Tau versus Myc-BioID2). The .raw files can be opened with open-source software, such as MetaMorpheus (https://github.com/smith-chem-wisc/MetaMorpheus).
NanoLC-MS/MS separations were performed with a Thermo Scientific Ultimate 3000 RSLCnano System. Top 20 data-dependent mass spectrometric analysis was performed with a Q Exactive HF-X Hybrid Quadrupole-Orbitrap Mass Spectrometer.
Description of the data and file structure
Data File Descriptions:
- MYC-BioID2.mzML: MS/MS peak lists for Myc-BioID2 control construct. Contains the peptide/protein-matching spectra data.
- MYC-BioID2.mzid: Myc-BioID2 control construct results file. Reports the peptide/protein IDs.
- BioID2-Tau.mzML: MS/MS peak lists for BioID2-Tau construct. Contains the peptide/protein-matching spectra data.
- BioID2-Tau.mzid: BioID2-Tau construct results file. Reports the peptide/protein IDs.
- BioID2-HA.mzML: MS/MS peak lists for BioID2-HA control construct. Contains the peptide/protein-matching spectra data.
- BioID2-HA.mzid: BioID2-HA control construct results file. Reports the peptide/protein IDs.
- Tau-BioID2.mzML: MS/MS peak lists for Tau-BioID2 construct. Contains the peptide/protein-matching spectra data.
- Tau-BioID2.mzid: Tau-BioID2 construct results file. Reports the peptide/protein IDs.
- LFQ_Analysis.mzML: MS/MS peak lists for the LFQ analysis. Contains the peptide/protein-matching spectra data.
- LFQ_Analysis.mzid: LFQ analysis results file. Reports the peptide/protein IDs.
- LFQ_Analysis.mzTab: Reports quantification data from LFQ analysis (available for LFQ only).
- MYC-BioID2-Replicate_1.raw: Replicate sample #1 of Myc-BioID2 control construct; mass spec file generated by the instrument. Thermo's proprietary format.
- MYC-BioID2-Replicate_2.raw: Replicate sample #2 of Myc-BioID2 control construct; mass spec file generated by the instrument. Thermo's proprietary format.
- MYC-BioID2-Replicate_3.raw: Replicate sample #3 of Myc-BioID2 control construct; mass spec file generated by the instrument. Thermo's proprietary format.
- BioID2-Tau-Replicate_1.raw: Replicate sample #1 of BioID2-Tau construct; mass spec file generated by the instrument. Thermo's proprietary format.
- BioID2-Tau-Replicate_2.raw: Replicate sample #2 of BioID2-Tau construct; mass spec file generated by the instrument. Thermo's proprietary format.
- BioID2-Tau-Replicate_3.raw: Replicate sample #3 of BioID2-Tau construct; mass spec file generated by the instrument. Thermo's proprietary format.
- BioID2-HA-Replicate_1.raw: Replicate sample #1 of BioID2-HA control construct; mass spec file generated by the instrument. Thermo's proprietary format.
- BioID2-HA-Replicate_2.raw: Replicate sample #2 of BioID2-HA control construct; mass spec file generated by the instrument. Thermo's proprietary format.
- BioID2-HA-Replicate_3.raw: Replicate sample #3 of BioID2-HA control construct; mass spec file generated by the instrument. Thermo's proprietary format.
- Tau-BioID2-Replicate_1.raw: Replicate sample #1 of Tau-BioID2 construct; mass spec file generated by the instrument. Thermo's proprietary format.
- Tau-BioID2-Replicate_2.raw: Replicate sample #2 of Tau-BioID2 construct; mass spec file generated by the instrument. Thermo's proprietary format.
- Tau-BioID2-Replicate_3.raw: Replicate sample #3 of Tau-BioID2 construct; mass spec file generated by the instrument. Thermo's proprietary format.
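For readers who want to inspect `LFQ_Analysis.mzTab` without dedicated proteomics software, the following is a minimal sketch using only the Python standard library. It assumes the file follows the general mzTab layout, in which each tab-separated line begins with a section code (`PRH` for the protein header, `PRT` for a protein row), and it converts the literal string `null` to `None`. The function name, column names, and sample text are illustrative placeholders and are not taken from the dataset.

```python
import io

def read_mztab_proteins(handle):
    """Return protein rows (PRT lines) as dicts keyed by the PRH header columns."""
    header = None
    rows = []
    for line in handle:
        fields = line.rstrip("\r\n").split("\t")
        if fields[0] == "PRH":
            header = fields[1:]
        elif fields[0] == "PRT" and header is not None:
            # "null" marks an undetected measurement or an unassigned descriptor
            values = [None if v == "null" else v for v in fields[1:]]
            rows.append(dict(zip(header, values)))
    return rows

# Hypothetical excerpt with placeholder accessions and column names
sample = (
    "MTD\tmzTab-version\t1.0.0\n"
    "PRH\taccession\tdescription\tabundance\n"
    "PRT\tACC0001\texample protein\t1.5e7\n"
    "PRT\tACC0002\tnull\tnull\n"
)
proteins = read_mztab_proteins(io.StringIO(sample))
```

To apply the same function to the real file, pass an open file handle (for example `open("LFQ_Analysis.mzTab", encoding="utf-8")`) in place of the `io.StringIO` object.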
Animals
Human MAPT (tau) knock-in (MAPT KI) mice were obtained from Dr. Karen Duff at Columbia University with permission from the Saido group at Riken Center for Brain Science. Mouse tau knockout (TKO, strain #007251) mice were obtained from Jackson Labs. Homozygous breeding pairs were used to generate an in-house colony for both mouse lines, and timed-pregnant females were used to obtain fetuses for primary cortical neuron cultures. Animals were housed at Michigan State University Grand Rapids Research Center Vivarium in a 12 h light/dark cycle with access to food and water. All animal procedures were performed in accordance with the guidelines approved by the Michigan State University Institutional Animal Care and Use Committee.
BioID2 Constructs
The original BioID2 expression constructs, MCS-13X Linker-BioID2-HA (BioID2-HA, Addgene, #80899) and Myc-BioID2-13X Linker-MCS (Myc-BioID2, Addgene, #92308), were a kind gift from the Kyle Roux lab. The MCS-13X Linker-BioID2-HA plasmid was used to fuse the BioID2 to the C-terminus of human Tau (hT40, 2N4R, 441 amino acids) to create a construct consisting of Tau-13xLinker-BioID2-HA (referred to as Tau-BioID2) and the BioID2 control construct consisted of the 13xLinker-BioID2-HA (referred to as BioID2-HA). The Myc-BioID2-13X Linker-MCS plasmid was used to fuse BioID2 to the N-terminus of tau to create the Myc-BioID2-13xLinker-Tau (referred to as BioID2-Tau) construct and the BioID2 control construct consisted of the Myc-BioID2-13xLinker (referred to as Myc-BioID2). BioID2 constructs feature a 13xLinker, which is a serine-glycine repeat sequence that acts as a flexible spacer to increase the biotinylation range of the BioID2 protein. The BioID2 constructs were then cloned into the pFIN vector (Addgene, #44352). All plasmid constructs were validated by restriction digestion and Sanger sequencing.
Lentiviral Production in HEK293T Cells
Lentiviruses were created to express Tau-BioID2, BioID2-Tau and the BioID2-HA and Myc-BioID2 controls. For each lentiviral preparation, HEK293T cells were plated in 4 x 150 mm cell culture dishes (Corning, #430599) at a density of 1 x 10^6 cells/dish in 25 ml complete DMEM. Culture dishes were incubated overnight in a humidified incubator. Two hours prior to transfection, DMEM was removed, and fresh DMEM was added. The following transfection mixture was then prepared for each 150 mm culture dish: 45 μg plasmid DNA (22.5 μg of the pFIN plasmid expressing the BioID2 protein of interest, 15 μg of pNHP lentiviral packaging vector (Addgene, #22500), 7.5 μg of pHEF-VSVG lentiviral envelope vector (Addgene, #22501)), 300 μl polyethylenimine transfection reagent and 6 ml of 150 mM NaCl. The transfection mixture was incubated for 20 minutes at room temperature, slowly added to each culture dish, and then transfected HEK293T cells were maintained overnight in a humidified incubator. Complete DMEM was replaced with freshly prepared viral medium (98% DMEM, 1% FBS, and 1% penicillin-streptomycin) and cells were maintained for 48 hours. Medium containing lentiviruses was centrifuged at 675 x g for 5 minutes and the supernatant containing lentiviruses was filtered through a 0.45 μm filter. Lentiviruses were harvested by layering on 20% sucrose solution and ultracentrifugation at 82,700 x g for 2 hours at 4°C using a Sorvall™ WX+ ultracentrifuge (ThermoScientific, #75000100). The lentiviral pellets were resuspended in 500 μl sterile PBS, aliquoted, snap-frozen in crushed dry ice, and stored at -80°C until use.
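The per-dish transfection quantities above can be scaled to the four dishes of one lentiviral preparation with simple arithmetic. The sketch below is only a bookkeeping aid; the dictionary keys and function name are illustrative.

```python
# Per-dish transfection mixture, as listed in the protocol above
PER_DISH = {
    "pFIN BioID2 plasmid (ug)": 22.5,
    "pNHP packaging vector (ug)": 15.0,
    "pHEF-VSVG envelope vector (ug)": 7.5,
    "polyethylenimine (ul)": 300.0,
    "150 mM NaCl (ml)": 6.0,
}

def scale_mixture(per_dish, n_dishes):
    """Multiply every component by the number of dishes."""
    return {name: amount * n_dishes for name, amount in per_dish.items()}

# Total plasmid DNA per dish is the sum of the three plasmid components
total_dna_per_dish = sum(
    amount for name, amount in PER_DISH.items() if name.endswith("(ug)")
)

# One preparation uses four 150 mm dishes
prep = scale_mixture(PER_DISH, 4)
```

Summing the three plasmid amounts reproduces the stated 45 μg of DNA per dish, which is a quick consistency check on the recipe.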
Primary Cortical Neuron Cell Culture
Timed-pregnant female MAPT KI and TKO mice were euthanized by intraperitoneal injection of 100 mg/kg Fatal-Plus solution diluted in saline. Mouse pups were collected on embryonic day 18 (E18) and kept in ice-cold 0.9% saline. Fetal cortical tissues were isolated under a dissecting microscope, cut into small pieces, and collected in a tube containing ice-cold calcium- and magnesium-free solution (CMF; contains 1x DPBS, 1x amphotericin B (Gibco, #15290018), 1x gentamicin (Gibco, #15750060), and 10% glucose (Sigma-Aldrich, #G8270)).
Cortical tissue pieces were washed four times in CMF and incubated in 0.25% trypsin solution (Gibco, #15090046) for 15 minutes at 37 °C. Trypsin was removed and cortices were washed two times in CMF. Trypsin inactivation solution (3 ml) was added containing 2.1 ml Hank’s Balanced Salt Solution (Gibco, #24020117), 0.6 ml newborn calf serum (Gibco, #16010167), and 0.3 ml DNase solution (Worthington, #LS002006). A homogenous cell suspension was obtained by gentle trituration of the tissue through a series of progressively smaller needles (30 x 14-gauge needle, 30 x 15-gauge needle, 20 x 16-gauge needle, 20 x 18-gauge needle, and 15 x 21-gauge needle). Cell suspensions were gently layered onto 5 ml sterile-filtered FBS and centrifuged at 200 x g for 5 minutes. Primary neuron cell pellets were gently resuspended in 1 ml neurobasal media (NBM; Gibco, #21103049) supplemented with L-glutamine (2 mM, Gibco, #25030081), amphotericin B (2.5 μg/ml, Gibco, #15290026), B-27 Supplement (Gibco, #12587001), and gentamicin (50 μg/ml, Gibco, #15710064). Cell counts were determined using a Countess 3 automated cell counter (Invitrogen, #AMQAX2000). Primary cortical neurons were plated at a density of 600,000 cells per well in a poly D-lysine-coated 6-well plate (Corning, #354413) and maintained in a humidified incubator at 37 °C and 5% CO2.
Primary Neuron Lentiviral Transduction
TKO primary cortical neurons were treated with lentiviruses on day in vitro 4 (DIV4) to induce expression of BioID2-Tau, Tau-BioID2 or the respective Myc-BioID2 and BioID2-HA controls. Lentiviruses were diluted in complete NBM, and primary neurons were transduced at an MOI of 200 (n=3 independent experiments). The neuronal culture was maintained for four days to allow lentiviral transgene expression before supplementing exogenous biotin (100 μM) into the medium. Lentivirus-transduced neurons were maintained until collection for biochemical or immunofluorescence assays as described below.
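As a worked illustration of the MOI, the number of transducing units (TU) needed per well is the MOI multiplied by the number of cells. The sketch below assumes that the plating density of 600,000 cells per well approximates the cell number at transduction, and it uses a purely hypothetical viral titer, since no titer is reported here.

```python
def transducing_units(moi, n_cells):
    """Transducing units required to reach a given MOI."""
    return moi * n_cells

def virus_volume_ul(moi, n_cells, titer_tu_per_ml):
    """Volume of viral stock (in microliters) for a given MOI and titer."""
    return transducing_units(moi, n_cells) / titer_tu_per_ml * 1000.0

# MOI and plating density are from the protocol; cell number at DIV4 is assumed equal
tu_per_well = transducing_units(200, 600_000)

# Hypothetical titer of 1e10 TU/ml, for illustration only
volume = virus_volume_ul(200, 600_000, 1e10)
```

With these inputs, each well requires 1.2 x 10^8 TU; the corresponding stock volume depends entirely on the measured titer of each preparation.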
Biotin-Streptavidin Affinity Pulldown
Ethanol-cleaned low retention microcentrifuge tubes (ThermoScientific, #3453) were placed in a magnetic separation stand. Streptavidin magnetic Dynabeads T1 (Invitrogen, #65602) were washed twice in 1 ml lysis buffer and then 1 mg of protein lysate containing biotinylated proteins (collected as described above) was added to the beads and rotated overnight at room temperature. The following day, unbound supernatant proteins were collected as the post-pulldown sample and stored at -80°C until further analysis. The beads containing bound biotinylated proteins were washed three times in lysis buffer, then resuspended in 1 ml 50 mM Tris-HCl, pH 7.4. Then 900 μl of the resuspended beads was transferred to a new low retention tube for mass spectrometry analysis (described below). The remaining 100 μl was transferred to a new tube and placed on a magnetic separation stand. The supernatant was discarded, and the beads were resuspended in an elution buffer containing 25 mM biotin prepared in lysis buffer. Efficient elution of biotinylated proteins was performed by competition with free biotin in the elution buffer and heating to 95°C for 15 minutes. The tube was placed on a magnetic separation stand, the supernatant containing biotinylated proteins was transferred to a new tube, and western blotting validation was performed as described above. Blots were incubated in IRDye® 680LT Streptavidin (1:5000, LI-COR Biosciences, #926-68031) overnight at 4°C. Blots were imaged using the LI-COR Odyssey® infrared system and processed for publication using ImageStudio software (v5.2.5, LI-COR Biosciences) and Adobe Illustrator 2023.
Nanoscale liquid chromatography coupled to tandem mass spectrometry (nano LC-MS/MS)
The tube containing the 900 μl of resuspended beads was placed on a magnetic separation stand and the supernatant was discarded. The beads were washed six times in 25 mM ammonium bicarbonate (pH 8) and then resuspended in 150 μl of 25 mM ammonium bicarbonate, 50% acetonitrile (ACN). On-bead protein digestion was performed by adding 100 ng of rLys-C (Promega, #V1671) and incubating for 90 minutes at 37°C, then adding 3 μg trypsin (Promega, #V5280) and incubating for 16-18 hours at 37°C. The tubes were placed on a magnetic separation stand and the supernatant was collected in a new low retention tube. Samples were dried to completion in a speed vacuum centrifuge at 30 °C before being resuspended in 50 μl of 25 mM ammonium bicarbonate, 5% ACN.
NanoLC-MS/MS separations were performed with a Thermo Scientific Ultimate 3000 RSLCnano System. Peptides were desalted in-line using a 3 μm diameter bead, C18 Acclaim PepMap trap column (75 μm × 20 mm) with 2% ACN, 0.1% formic acid (FA) for 5 min with a flow rate of 5 μl/min at 40°C. The trap column was then brought in line with a 2 μm diameter bead, C18 EASY-Spray column (75 μm × 250 mm) for analytical separation over 60 min with a flow rate of 350 nl/min at 40°C. The mobile phase consisted of 0.1% FA (buffer A) and 0.1% FA in ACN (buffer B). The separation gradient was as follows: 5 min desalting, 40 min 4–40% B, 2 min 40–65% B, 2 min 65–95% B, 7 min 95% B, 1 min 95–4% B, 3 min 4% B. One microliter of each sample was injected. Top 20 data-dependent mass spectrometric analysis was performed with a Q Exactive HF-X Hybrid Quadrupole-Orbitrap Mass Spectrometer. MS1 resolution was 60K at 200 m/z with a maximum injection time of 45 ms, AGC target of 3e6, and scan range of 300–1500 m/z. MS2 resolution was 30K at 200 m/z, with a maximum injection time of 54 ms, AGC target of 1e5, and isolation range of 1.3 m/z. HCD normalized collision energy was 28. Only ions with charge states from +2 to +6 were selected for fragmentation, and dynamic exclusion was set to 30 s. The electrospray voltage was 1.9 kV at a 2.0 mm tip to inlet distance. The ion capillary temperature was 280°C and the RF level was 55.0. All other parameters were set as default.
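The separation gradient can be written as a list of segments, which makes it easy to confirm that the steps add up to the 60 min method and to look up the nominal percentage of buffer B at any time point. This is a minimal sketch: holding 4% B during the 5 min desalting step is an assumption made for illustration, and the function name is hypothetical.

```python
# (duration_min, start_percent_B, end_percent_B), in the order given above
GRADIENT = [
    (5, 4, 4),    # desalting; holding 4% B here is an assumption
    (40, 4, 40),
    (2, 40, 65),
    (2, 65, 95),
    (7, 95, 95),
    (1, 95, 4),
    (3, 4, 4),
]

def percent_b(t_min):
    """Linearly interpolate %B at t_min minutes after the start of the method."""
    start = 0.0
    for duration, b0, b1 in GRADIENT:
        end = start + duration
        if start <= t_min <= end:
            return b0 + (b1 - b0) * (t_min - start) / duration
        start = end
    raise ValueError("time is outside the method")

total_minutes = sum(duration for duration, _, _ in GRADIENT)
```

For example, `percent_b(25)` falls halfway through the 40 min ramp from 4% to 40% B.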
Mass Spectrometry Protein Identification
RAW mass spectrometry files were searched using Sequest HT against the reviewed Mus musculus UniProt proteome database (UP000000589, 25,285 unique sequences) with Thermo Scientific Proteome Discoverer software (version 2.5). Enzyme specificity was set to trypsin with an MS1 tolerance of 10 ppm and a fragment tolerance of 0.02 Da. Oxidation (M), biotinylation (K), acetylation (protein N-term), methionine loss (protein N-term), and biotinylation (protein N-term) were set as dynamic modifications. Peptide and protein false discovery rates (FDR) were 1%, with the threshold determined via decoy search using the Percolator algorithm. At least two peptide identifications were required per protein identification. All other parameters were set as default. Mass spectrometry RAW files from each experimental sample (n=3) or control sample (n=3) were analyzed by qualitative protein identification or LFQ (see details below). For the qualitative method, proteins identified in only one independent experimental replicate were excluded from further analysis, as were all keratins. To identify potential tau protein interacting partners, protein lists from Tau-BioID2 and BioID2-Tau experiments were compared to the respective controls BioID2-HA and Myc-BioID2. Proteins identified in the controls were removed and the curated Tau-BioID2 and BioID2-Tau lists were combined to generate the full list of potential tau interactors.
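The qualitative curation described above amounts to a few set operations. The following is a minimal sketch under stated assumptions, not the authors' script: keratins are recognized by a `Krt` gene-symbol prefix, a protein counts as "identified in the controls" if it appears in any control replicate, and all gene symbols and function names are hypothetical placeholders.

```python
from collections import Counter

def reproducible(replicates, min_replicates=2):
    """Return proteins identified in at least `min_replicates` replicates."""
    counts = Counter(p for rep in replicates for p in set(rep))
    return {p for p, n in counts.items() if n >= min_replicates}

def is_keratin(gene_symbol):
    # Assumption: keratins are recognized by a "Krt" gene-symbol prefix
    # (this prefix would also match keratin-associated proteins).
    return gene_symbol.startswith("Krt")

def curate(experimental, control):
    """Reproducible experimental proteins, minus control proteins and keratins."""
    found_in_control = set().union(*control)  # any control replicate (assumption)
    kept = reproducible(experimental) - found_in_control
    return {p for p in kept if not is_keratin(p)}

# Hypothetical replicate lists with placeholder gene symbols
tau_bioid2 = [{"GeneA", "GeneB", "Krt1"}, {"GeneA", "Krt1", "GeneC"}, {"GeneA", "GeneB", "GeneD"}]
bioid2_ha = [{"GeneB"}, {"GeneE"}, {"GeneE"}]
bioid2_tau = [{"GeneF", "GeneA"}, {"GeneF"}, {"GeneG"}]
myc_bioid2 = [{"GeneA"}, set(), set()]

# Each fusion is compared with its own control before the lists are combined
combined = curate(tau_bioid2, bioid2_ha) | curate(bioid2_tau, myc_bioid2)
```

Keeping the comparison paired (Tau-BioID2 with BioID2-HA, BioID2-Tau with Myc-BioID2) before taking the union mirrors the order of operations in the text.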
LFQ of Identified Proteins
LFQ was performed using the Precursor Ions Quantifier node. Precursor abundance was based on peak intensity. Protein abundance was normalized by total peptide amount. The protein abundance ratio was calculated using pairwise ratios, whereby the median peptide ratio is selected as the protein abundance ratio. The following criteria were used to define proteins as Tau-BioID2 or BioID2-Tau interactors: 1) being identified in at least two of the independent experimental replicates, and 2) being detected at ≥ 3-fold increase compared to the respective BioID2 control. Proteins identified in only one independent replicate as well as contaminant keratins were removed from further analysis. Data analysis output in .mzTab files contains values of "null" for numerical categories when there is an undetected measurement or for descriptive categories when those descriptors were not assigned.
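The two LFQ criteria, together with one simplified reading of the pairwise-ratio calculation, can be sketched as follows. This is an illustration under assumptions and not a reimplementation of the Precursor Ions Quantifier node: the software's handling of missing values and normalization is not modeled, and all function names and intensities are hypothetical.

```python
from itertools import product
from statistics import median

def peptide_ratio(exp_intensities, ctrl_intensities):
    """Median of all pairwise experimental/control intensity ratios for one peptide."""
    return median(e / c for e, c in product(exp_intensities, ctrl_intensities))

def protein_ratio(peptides):
    """Median peptide ratio, taken as the protein abundance ratio."""
    return median(peptide_ratio(e, c) for e, c in peptides)

def passes_lfq(abundance_ratio, n_replicates, min_fold=3.0, min_replicates=2):
    """Apply the >= 2 replicates and >= 3-fold criteria; None stands for 'null'."""
    if abundance_ratio is None:
        return False
    return n_replicates >= min_replicates and abundance_ratio >= min_fold

# Hypothetical protein with three peptides: (experimental, control) intensities
peptides = [
    ([30.0, 30.0, 30.0], [10.0, 10.0, 10.0]),
    ([80.0, 80.0, 80.0], [10.0, 10.0, 10.0]),
    ([40.0, 40.0, 40.0], [10.0, 10.0, 10.0]),
]
ratio = protein_ratio(peptides)
keep = passes_lfq(ratio, n_replicates=3)
```

Treating a `null` ratio as a failed criterion is a conservative choice made here for illustration; the text does not state how such entries were handled.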
Functional Protein Association Networks
Functional protein-protein interaction networks were visualized by the Search Tool for the Retrieval of Interacting Genes/Proteins (STRING) analysis. STRING V12.0 was used to visualize the protein interactome network, including both functional associations retrieved from published databases, text mining, and computational prediction methods, and physical interactions retrieved from experimental data (genetic, biochemical, and biophysical techniques), with a minimum required interaction score of 0.7 (high confidence). STRING V12.0 was used to perform Markov Clustering (MCL) with an inflation value of 3.0 to reduce the cluster size. STRING V12.0 was also used to perform the Gene Ontology (GO) enrichment analysis and KEGG pathway analysis. GO cellular component, molecular function, and biological process terms and KEGG pathways are reported for annotations with FDR ≤ 0.05.
Statistical Methods
Enriched GO cellular component and molecular function annotations with FDR ≤ 0.05 were graphed as -Log10(FDR). Statistical tests and graphs were created using GraphPad Prism (V.10.0.2; GraphPad Software, La Jolla, California, USA).
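The transformation used for graphing can be sketched in a few lines. The example below filters annotations at FDR ≤ 0.05 and converts the retained values to -Log10(FDR); the term names and FDR values are hypothetical placeholders, not results from this dataset.

```python
import math

def neg_log10_fdr(terms, cutoff=0.05):
    """Keep terms with FDR <= cutoff and return (name, -log10(FDR)), largest first."""
    kept = [(name, -math.log10(fdr)) for name, fdr in terms if fdr <= cutoff]
    return sorted(kept, key=lambda item: item[1], reverse=True)

# Hypothetical enrichment results: (annotation, FDR)
terms = [("term A", 1e-6), ("term B", 0.05), ("term C", 0.2)]
plotted = neg_log10_fdr(terms)
```

Larger -Log10(FDR) values correspond to smaller FDRs, so the most significant annotations appear first in the sorted output.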
Experimental Design and Statistical Rationale
Mass spectrometry identification of biotinylated proteins in BioID2 experiments was performed in three independent biological replicates per experimental or control condition. Experimental conditions included Tau-BioID2 and BioID2-Tau, while the controls were BioID2-HA and Myc-BioID2. Two control proteins were utilized because the position of the 13X linker as well as the tags used (Myc and HA) were different. For each experimental and control condition, peptides identified by mass spectrometry were filtered by 1% FDR and at least two peptides were required to identify a protein. For both the qualitative and LFQ analyses, proteins identified in fewer than two experimental replicates and two control replicates were omitted from further analysis. LFQ protein ratios were measured relative to the respective control (Tau-BioID2 versus BioID2-HA or BioID2-Tau versus Myc-BioID2). For the LFQ analysis, we used a cutoff abundance ratio of ≥ 3-fold. It is important to note that we used the ratios as a cutoff for GO grouping and for designating proteins for follow-up validation experiments to confirm interacting partners, and thus did not consider the LFQ p-values. Validation of tau interacting partners using either co-IP or PLA was performed in three independent biological replicates per condition. Antibody isotype controls were used for co-IP experiments. PLA experimental controls included omission of each primary antibody and omission of all primary antibodies in MAPT KI neurons, as well as the full PLA reaction in TKO neurons. Lists of identified proteins were curated and filtered according to the defined criteria using RStudio (V.1.4.1717). Schematic figures were created with BioRender.com.
- Atwa, Ahmed; Alhadidy, Mohammed M.; Lamp, Jared et al. (2025). BioID2-Based Tau Interactome Reveals Novel and Known Protein Interactions Associated with Multiple Cellular Pathways. Journal of Proteome Research. https://doi.org/10.1021/acs.jproteome.5c00473
