Subcellular localization of over 600 Giardia lamblia proteins
Data files
Apr 01, 2026 version files (3.48 GB total)
- 2026_GiardiaGFP_images_database.zip (3.48 GB)
- README.md (2.31 KB)
Abstract
Giardia intestinalis is a globally prevalent cause of waterborne diarrheal disease, yet about 40% of its proteome remains functionally uncharacterized because many proteins lack conserved homologs and experimental validation of protein function is limited. To begin addressing this gap, we created a large-scale subcellular localization resource by fluorescently tagging and imaging 608 Giardia fusion proteins (12% of the proteome) expressed in live cells from native promoters. This dataset includes 240 hypothetical proteins, 215 domain-family proteins (including ankyrin repeat and NEK kinase families), 171 diplomonad- or Giardia-specific proteins, 69 conserved eukaryotic proteins, and 77 proteins with known functions that were previously unlocalized. Imaging revealed localization to cytoskeletal and Giardia-specific organelles (eight flagella, the ventral disc, and the median body), along with novel components of the plasma membrane and endomembrane systems. Integrating localization data with domain architecture, homology, and Giardia-specific Gene Ontology terms, we produced a "localization-informed" gene annotation with a standardized, structured nomenclature. This resource provides the largest experimentally validated functional annotation of the Giardia proteome to date. It links predicted gene models to cellular structures, creates testable hypotheses for protein function, and establishes a durable framework for future studies of cell biology, pathogenesis, and eukaryotic evolution in this deeply divergent diplomonad lineage.
Dataset DOI: 10.5061/dryad.jm63xsjrn
Description of the data and file structure
We created a large-scale subcellular localization resource by fluorescently tagging and imaging 608 Giardia fusion proteins (12% of the proteome) expressed in live cells from native promoters. This dataset includes 240 hypothetical proteins, 215 domain-family proteins (including ankyrin repeat and NEK kinase families), 171 diplomonad- or Giardia-specific proteins, 69 conserved eukaryotic proteins, and 77 proteins with known functions that were previously unlocalized.
Protein localization resource folder and nomenclature. The dataset contains comprehensive imaging data for all Giardia GFP-tagged strains, including new images of strains from cytoskeletal proteomes. Each strain (indicated by gene ID) has a corresponding folder with five files: a single raw (unprocessed) DIC TIFF image, a raw FITC (green) TIFF image stack, processed and cropped versions of both the DIC and FITC images, and a cropped PNG fluorescence image of one representative cell. The DIC and FITC images correspond to the same microscope field. In addition, single FITC images were created from FITC stack files, either by choosing the most representative slice or by performing a maximum or average intensity projection of in-focus slices. Raw file names include the date the image was taken, the gene ID, the image number, and whether the file is the DIC or FITC image. Cropped and processed files are prefixed with “cropped,” and single-cell files are prefixed with “one cell.” “MAX” indicates a maximum intensity projection. Note that some folders do not have unprocessed DIC and FITC single slices saved separately; instead, a single stack file combines DIC and FITC. The total number of files may therefore vary between strain folders.
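The naming scheme above can be applied programmatically when browsing the unzipped archive. The following Python sketch is illustrative only: the function names `classify_image_file` and `summarise_strain_folders` are hypothetical, and the exact separators, letter case, and token order in the file names are assumptions, since the description gives the labels (“cropped,” “one cell,” “MAX,” DIC, FITC) but not a strict pattern. The matching is therefore deliberately loose.

```python
import re
from pathlib import Path


def classify_image_file(name: str) -> dict:
    """Classify one file name using the labels given in the dataset description.

    Assumption: labels may appear with spaces, underscores, or hyphens and in
    any letter case, so matching is case-insensitive and separator-agnostic.
    """
    stem = Path(name).stem
    # Collapse separators so "one cell", "one_cell" and "one-cell" all match.
    normalised = re.sub(r"[\s_\-]+", " ", stem.lower()).strip()
    tokens = set(re.split(r"[^A-Za-z0-9]+", stem.upper()))

    if "one cell" in normalised:
        kind = "single_cell"
    elif "CROPPED" in tokens:
        kind = "cropped"
    else:
        kind = "raw"

    if "DIC" in tokens and "FITC" in tokens:
        channel = "DIC+FITC"  # combined stack, as noted for some folders
    elif "DIC" in tokens:
        channel = "DIC"
    elif "FITC" in tokens:
        channel = "FITC"
    else:
        channel = "unknown"

    return {"kind": kind, "channel": channel, "max_projection": "MAX" in tokens}


def summarise_strain_folders(root: Path) -> dict:
    """Map each folder name (expected to be a gene ID) to its classified images."""
    summary = {}
    for path in sorted(root.rglob("*")):
        if path.is_file() and path.suffix.lower() in {".tif", ".tiff", ".png"}:
            info = classify_image_file(path.name)
            summary.setdefault(path.parent.name, []).append((path.name, info))
    return summary
```

Calling `summarise_strain_folders(Path("path/to/unzipped/archive"))` (a placeholder path) would return one entry per strain folder, which makes it easy to spot folders that deviate from the five-file layout. File names that the loose matching cannot place are reported with channel `"unknown"` rather than guessed.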
Files and variables
File: 2026_GiardiaGFP_images_database.zip
Description: Protein localization resource folder and nomenclature. This zipped file contains comprehensive imaging data for the 608 GFP-tagged strains.
Code/software
The images can be opened with Fiji, ImageJ, or any other open-source application that can read TIFF-formatted images and stacks.
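For users who prefer scripting to a graphical viewer, the maximum and average intensity projections mentioned in the dataset description can be reproduced on a FITC stack once it has been loaded as an array. The sketch below uses NumPy and assumes the stack is already in memory with shape (z, y, x); reading the TIFF itself would require a third-party reader and is not shown. The function name `project_stack` and the `slices` parameter are hypothetical, and which slices count as "in focus" is left to the user.

```python
import numpy as np


def project_stack(stack: np.ndarray, mode: str = "max", slices=None) -> np.ndarray:
    """Collapse a (z, y, x) image stack into a single (y, x) image.

    mode   -- "max" for a maximum intensity projection,
              "average" for an average intensity projection.
    slices -- optional iterable of z indices to include (e.g. the
              in-focus slices); all slices are used when omitted.
    """
    if stack.ndim != 3:
        raise ValueError("expected a 3-D stack shaped (z, y, x)")
    selected = stack if slices is None else stack[list(slices)]
    if mode == "max":
        return selected.max(axis=0)
    if mode == "average":
        return selected.mean(axis=0)
    raise ValueError(f"unknown projection mode: {mode!r}")
```

Note that the average projection returns floating-point values, so the result may need to be rescaled or cast before it is saved in an integer image format.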
Giardia cultivation and electroporation. All strains used in this study were derived from Giardia lamblia WBC6 (ATCC 50803). Frozen stocks were thawed and cultured for at least 24 to 48 hours before phenotypic analysis unless otherwise specified. Routine cultivation was performed in 16 ml screw-cap tubes (BD Falcon) containing TYI-S-33 medium supplemented with 0.05% ovine/bovine bile, 5% adult bovine serum, and 5% fetal bovine serum. Cultures were incubated at 37 °C without agitation. Upon reaching confluence, tubes were chilled on ice for 30 minutes to detach trophozoites, and 0.5–1 ml of the resulting suspension was inoculated into 12 ml of pre-warmed fresh medium.
Electroporation of episomal plasmids was performed using previously established protocols. Briefly, 1 × 10⁷ trophozoites were electroporated with 20–40 µg of DNA using a Bio-Rad GenePulser XL and a 4 mm cuvette at 375 V, 700 Ω, and 1000 µF. Electroporated cells were transferred to culture tubes and maintained at 37 °C with medium replaced every 48 hours. Following electroporation, puromycin selection began at 10 µg/ml and was increased to 50 µg/ml when tubes were ≥ 50% confluent. Cultures were maintained under selective pressure for at least 1 to 2 weeks prior to cryopreservation with 9% DMSO. Alternatively, electroporated cells were immediately resuspended in 4 ml of TYI-S-33 medium and transferred to 6-well anaerobic culture plates. After 24 hours, the medium was aspirated and replaced, and puromycin was added to 15 µg/ml. Plates were incubated for 7–14 days at 37 °C in Mitsubishi AnaeroPack 2.5 L rectangular jars containing Mitsubishi AnaeroPack-Anaero Gas Generator sachets (Thermo Scientific). Medium was aspirated and replaced every 5 to 7 days. Upon visible outgrowth, puromycin was increased to 50 µg/ml, and confluent cultures were transferred to 16 ml screw-cap tubes with 12 ml medium for routine propagation.
Construction of C-terminal GFP and mNeonGreen (mNG) fusion strains. C-terminal GFP-tagged constructs were generated using Gateway cloning. PCR primers were designed to amplify full-length G. lamblia WBC6 open reading frames (excluding stop codons) plus about 200 bp of upstream promoter sequence. Amplifications were performed using either PfuTurbo Hotstart or Easy-A High-Fidelity PCR Master Mix (Agilent). PCR products were cloned into the entry vectors pENTR/D-TOPO or pCR8/GW/TOPO (Thermo Fisher Scientific) and confirmed by Sanger sequencing. Gateway LR recombination reactions were used to clone inserts into the destination vector pcGFP1Fpac (GenBank: MH048881), compatible with Giardia expression. Recombinant constructs were verified by AscI digestion, and plasmids were prepared using the Qiagen Plasmid Plus Midi Kit. mNeonGreen fusions were cloned into the plasmid pKS_mNeonGreen-N11_PAC.
Live imaging of fluorescently tagged strains. For live imaging, GFP- or mNG-expressing strains were thawed and cultured in TYI-S-33 medium for at least 24 hours. Cells were chilled on ice for 30 minutes, centrifuged at 900 × g for 5 minutes, resuspended in warm medium, and seeded into MatTek dishes or black-walled 96-well glass-bottom imaging plates (Cellvis). Cells were incubated for 1 hour at 37 °C to promote attachment. Immediately prior to imaging, the culture medium was gently replaced with pre-warmed 1× HEPES-buffered saline (HBS). Additional HBS washes were applied as needed to remove loosely attached cells. Differential interference contrast (DIC) and epifluorescence images were acquired using a Leica DMI6000B microscope. Image processing was performed using Fiji (ImageJ v8).
