FASTA file of sequences of identified proteins in Anastrepha ludens reproductive tissues


Sirot, Laura (2022), FASTA file of sequences of identified proteins in Anastrepha ludens reproductive tissues, Dryad, Dataset,


Seminal fluid proteins (Sfps) modify female phenotypes and have wide-ranging evolutionary implications on fitness in many insects. However, in the Mexican fruit fly, Anastrepha ludens, a highly destructive agricultural pest, the functions of Sfps are still largely unknown. To gain insights into female phenotypes regulated by Sfps, we used nano liquid chromatography mass spectrometry to conduct a proteomic analysis of the soluble proteins from reproductive organs of A. ludens. The proteins predicted to be transferred from males to females during copulation were 100 proteins from the accessory glands, 69 from the testes, and 20 from the ejaculatory bulb, resulting in 141 unique proteins after accounting for redundancies from multiple tissues. These 141 included orthologs to Drosophila melanogaster proteins involved mainly in oogenesis, spermatogenesis, immune response, lifespan, and fecundity. In particular, we found one protein associated with female olfactory response to repellent stimuli (Scribble), and two related to memory formation (aPKC, Shibire). Together, these results raise the possibility that A. ludens Sfps could play a role in regulating female olfactory responses and memory formation and could be indicative of novel evolutionary functions in this important agricultural pest.


Experimental design

The proteomic analysis was performed using reproductive tissues from males and females. We focused on three tissues from males (male accessory glands [MAG], testes, and ejaculatory bulb; Fig. 1a), whereas we used the lower reproductive tracts (without the ovaries and accessory glands) of females that were either mated or unmated (Fig. 1b). This approach, in which non-chemically marked organs from both females and males are used, has been applied successfully in other proteomic studies [13]. We analyzed two replicates each for the MAG and for the reproductive tracts of unmated and mated females, and one replicate each for the testes and the ejaculatory bulb. Each replicate (i.e., experimental unit) comprised tissues from 30 insects.

Rearing and mating of flies

Mass-reared pupae were obtained from Moscafrut, Metapa de Dominguez, Chiapas, Mexico. Adults were separated by sex after emergence and kept in controlled conditions (12:12 h light:dark cycle, 25 ± 2°C, 60 ± 10% RH). Three days after the emergence of the adults, males and females were separated and placed in cages, and fed sucrose and yeast hydrolysate (3:1 ratio). Matings were observed when adults were 15 days of age. In this species, males and females reach reproductive maturity 10-14 days after emerging, depending on their nutritional history [14, 15].

Dissection of reproductive tracts

Females were dissected within 10 min after mating ended, on a glass slide with wells filled with a buffer solution (10X PBS, 10% SDS & 7X Complete, Mini, EDTA-free Protease Inhibitor Tablet; Roche, Mannheim, Germany; every component was diluted to a 1X working solution) at 14 ± 2°C. Dissection and isolation of reproductive tracts were performed using a stereo microscope (SZX7 zoom; Olympus, Tokyo, Japan). To minimize protein degradation, dissections were carried out on cold gel blocks with sterilized tweezers (forceps # 5; Zwiss, Rubis, Switzerland).

Protein extraction

The tissues of A. ludens were processed and protein quantity was assessed using the BCA Assay Kit (Pierce, Rockford, IL). In brief, tissues were homogenized with 3 mL of lysis buffer containing 150 mM Tris (pH 8.2), 1% SDS, and 1 mM phenylmethylsulfonyl fluoride. Subsequently, the mixture was centrifuged at 10,000×g for 15 minutes, and the supernatant was recovered and stored at -80°C for further analysis. To remove the majority of sperm proteins from the analysis, the proteins that aggregated in the pellet were not analyzed. Thus, the majority of the identified proteins are expected to be soluble seminal fluid proteins.

Protein digestion

A total of 50 µg of protein extract from each tissue was reduced with 10 mM tris(2-carboxyethyl)phosphine for 45 min at 60°C and alkylated with 30 mM iodoacetamide for 1 h at 25°C in darkness; the remaining iodoacetamide was then quenched with 30 mM DTT for 10 min. Proteins were precipitated with four volumes of acetone at -20°C for 15 h and recovered by centrifugation at 3000×g for 30 min at 4°C. The supernatant was discarded, and the protein pellet was resuspended in 100 μL of 50 mM ammonium bicarbonate. Proteins were digested with trypsin (cat. No. V528A, Trypsin Gold, Promega, Madison, WI) at a 1:30 (w/w) trypsin:protein ratio for 16 h at 37°C. Afterward, trypsin was added at a 1:60 (w/w) trypsin:protein ratio for 4 h at 3°C. Subsequently, the digests were fractionated using strong cation exchange cartridges (Thermo Scientific, Bellefonte, PA, USA). Fractionation yielded three fractions based on sequential elution with 150, 250, and 500 mM KCl (elution buffer). Each fraction was then desalted with C18 cartridges and dried using a CentriVap (Labconco; Kansas City, MO).
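As a quick check of the digestion arithmetic, the trypsin amounts implied by the two w/w ratios can be worked out from the 50 µg protein input. This is a minimal sketch, assuming the full 50 µg is still present at digestion (losses during acetone precipitation are ignored); the function name is hypothetical.

```python
def trypsin_mass_ug(protein_ug: float, protein_parts: float) -> float:
    """Trypsin mass (µg) for a 1:protein_parts (w/w) trypsin:protein ratio."""
    if protein_parts <= 0:
        raise ValueError("protein_parts must be positive")
    return protein_ug / protein_parts


# Assumption: all 50 µg of protein extract reaches the digestion step.
protein_ug = 50.0
first_addition = trypsin_mass_ug(protein_ug, 30)   # 1:30 ratio, 16 h step
second_addition = trypsin_mass_ug(protein_ug, 60)  # 1:60 ratio, 4 h step
total_trypsin = first_addition + second_addition

print(f"First addition:  {first_addition:.2f} µg trypsin")
print(f"Second addition: {second_addition:.2f} µg trypsin")
print(f"Total:           {total_trypsin:.2f} µg trypsin")
```

Under that assumption, the two additions together correspond to an overall trypsin:protein ratio of 1:20 (w/w).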

Nano LC-MS/MS analysis

Samples were analyzed by nano LC-MS/MS using an Orbitrap Fusion™ Tribrid™ mass spectrometer interfaced with an UltiMate 3000 RSLC system (Dionex; Sunnyvale, CA) and set with an “EASY Spray” nano ion source (Thermo-Fisher Scientific; San Jose, CA). Each reconstituted sample (5 μL) was loaded into a nanoviper C18 trap column (3 µm, 75 µm × 2 cm, Dionex) at a flow rate of 3 μL min⁻¹ and separated on an EASY spray C18 RSLC column (2 µm, 75 µm × 25 cm), using a 100 min gradient with a flow rate of 300 nL min⁻¹, and using 0.1% formic acid in LC-MS grade water (solvent A) and 0.1% formic acid in 90% acetonitrile (solvent B). The gradient was as follows: 10 min solvent A, 7-20% solvent B within 25 min, 20% solvent B for 15 min, 20-25% solvent B for 15 min, 25-95% solvent B for 20 min, and 8 min solvent A. The mass spectrometer was operated in the positive ion mode with nano spray voltage set at 3.5 kV and source temperature at 280 °C. External calibrants included caffeine, Met-Arg-Phe-Ala (MRFA), and Ultramark 1621.

Decision tree-driven MS/MS

The mass spectrometer was operated in a data-dependent mode. Briefly, survey full-scan MS spectra were acquired in the Orbitrap analyzer. The scanned mass range was set to 350-1500 mass-to-charge ratio (m/z) at a resolution of 120,000 full width at half maximum. We set the automatic gain control (AGC) to 4.0 × 10⁵ ions, the maximum injection time to 50 ms, and dynamic exclusion to 1 at 90 s with a 10 ppm mass tolerance. Subsequently, a top speed survey scan of 3 s was used to select precursors for decision tree-based Orbitrap CID or HCD fragmentation [16, 17]. The signal threshold for triggering an MS/MS event was set to 1.0 × 10⁴, and the normalized collision energy was set to 35% and 30% for CID and HCD, respectively. An AGC of 3.0 × 10⁴ and an isolation window of 1.6 m/z were set for both fragmentation types. Additional parameters for CID included an activation Q of 0.25 ms and an injection time of 50 ms. For HCD, the first mass was set to 120 m/z and the injection time to 100 ms. The settings for the decision tree were as follows: for HCD fragmentation, charge states 2 or 3 were scanned in a range of 650-1200 m/z, charge state 4 in a range of 900-1200 m/z, and charge state 5 in a range of 950-1200 m/z; for CID fragmentation, charge state 3 was scanned in a range of 650-1200 m/z, charge state 4 in a range of 300-900 m/z, and charge state 5 in a range of 300-950 m/z. All data were acquired with Xcalibur software (Thermo-Fisher Scientific).
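The charge-state and m/z ranges listed for the decision tree can be written as a small lookup table. The sketch below only encodes the ranges stated above; it is not the instrument's acquisition logic, which runs inside the vendor software. The names `DECISION_TREE_RANGES` and `candidate_methods` are hypothetical, and treating the range limits as inclusive is an assumption. Because some listed ranges overlap (for example, charge state 3 at 650-1200 m/z appears under both methods), the function returns every method whose range matches rather than choosing one.

```python
# Ranges (m/z) per fragmentation method and precursor charge state,
# copied from the settings listed in the text.
DECISION_TREE_RANGES = {
    "HCD": {2: (650, 1200), 3: (650, 1200), 4: (900, 1200), 5: (950, 1200)},
    "CID": {3: (650, 1200), 4: (300, 900), 5: (300, 950)},
}


def candidate_methods(charge: int, mz: float) -> list[str]:
    """Return the methods whose listed range contains this precursor.

    Assumption: range limits are inclusive at both ends.
    """
    matches = []
    for method, ranges_by_charge in DECISION_TREE_RANGES.items():
        bounds = ranges_by_charge.get(charge)
        if bounds is not None and bounds[0] <= mz <= bounds[1]:
            matches.append(method)
    return matches


# Illustrative precursors (charge state, m/z); values are made up.
for charge, mz in [(2, 700.0), (4, 500.0), (3, 800.0), (2, 400.0)]:
    print(charge, mz, candidate_methods(charge, mz))
```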

Mass spectra were analyzed with the Proteome Discoverer Program 2.1 (PD, Thermo Fisher Scientific Inc.). The subsequent searches were run using Mascot server (version 2.4.1, Matrix Science, Boston, MA) and SEQUEST HT [18]. The search with both engines was conducted against a translated database generated from A. ludens transcriptomic data [5]. The search parameters were: full-tryptic protease specificity, two missed cleavages allowed, and carbamidomethylation of cysteine (+57.021 Da) as a static modification. Dynamic modifications included methionine oxidation (+15.995 Da) and deamidation of asparagine/glutamine (+0.984 Da). For the MS2 (MS/MS, or tandem mass spectrometry) method, in which identification was performed at high resolution in the Orbitrap, precursor and fragment ion tolerances of ±10 ppm and ±0.2 Da were applied. The resulting peptide hits were filtered to a maximum of 1% FDR using the Percolator algorithm [19].
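To put the two tolerances in concrete terms, a ppm tolerance scales with the mass being measured, whereas a Da tolerance is a fixed width. The sketch below illustrates that arithmetic only; it is not how Mascot or SEQUEST HT implement matching, and the function names and example masses are hypothetical.

```python
def ppm_window(theoretical: float, ppm: float = 10.0) -> tuple[float, float]:
    """Lower and upper bounds of a +/- ppm window around a theoretical value."""
    delta = theoretical * ppm / 1e6
    return theoretical - delta, theoretical + delta


def within_ppm(observed: float, theoretical: float, ppm: float = 10.0) -> bool:
    """True if the relative error is within the ppm tolerance (precursor check)."""
    return abs(observed - theoretical) / theoretical * 1e6 <= ppm


def within_da(observed: float, theoretical: float, tol_da: float = 0.2) -> bool:
    """True if the absolute error is within the Da tolerance (fragment check)."""
    return abs(observed - theoretical) <= tol_da


# Illustrative values only: at 1000 the +/-10 ppm window is +/-0.01 wide.
low, high = ppm_window(1000.0)
print(f"{low:.4f} to {high:.4f}")
print(within_ppm(1000.005, 1000.0))  # 5 ppm off
print(within_ppm(1000.02, 1000.0))   # 20 ppm off
print(within_da(500.15, 500.0))      # 0.15 Da off
```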

In silico protein annotation

From the protein lists obtained from the Proteome Discoverer Program, a new database was created using a BioVenn diagram. For reproductive tissues with two replicates (i.e., MAG, and reproductive tracts of unmated and mated females), only the proteins found in both replicates were retained; for samples with only one replicate (i.e., ejaculatory bulb and testes), all proteins found were considered for further analysis.
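The replicate rule described above amounts to a set intersection for tissues with two replicates and a pass-through for tissues with one. The following is a minimal sketch of that rule in plain Python, not the BioVenn tool itself; the function name and the protein identifiers are hypothetical placeholders.

```python
def consensus_proteins(replicates: list[list[str]]) -> set[str]:
    """Proteins present in every replicate of a tissue.

    With a single replicate, all of its proteins are kept.
    """
    if not replicates:
        return set()
    protein_sets = [set(rep) for rep in replicates]
    return set.intersection(*protein_sets)


# Hypothetical identifiers, for illustration only.
mag_rep1 = ["prot_A", "prot_B", "prot_C"]
mag_rep2 = ["prot_B", "prot_C", "prot_D"]
testes_rep1 = ["prot_C", "prot_E"]

mag_consensus = consensus_proteins([mag_rep1, mag_rep2])
testes_consensus = consensus_proteins([testes_rep1])

print(sorted(mag_consensus))
print(sorted(testes_consensus))
```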

We submitted the FASTA files of proteins to a filtering process for the removal of duplicates. We searched for redundant proteins by first combining all protein sequences from all male and female tissues (n=3,096). We then used CD-HIT-DUP (v0.0.1, on the Galaxy platform with default parameters) to find duplicates. This analysis found no redundant protein sequences.
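For readers who want to repeat a basic redundancy check on the FASTA file without Galaxy, exact duplicate sequences can be found by grouping records on their sequence string. This is a simplified stand-in written in plain Python, not the CD-HIT-DUP algorithm or its defaults; the function names and the toy records are hypothetical.

```python
from collections import defaultdict


def read_fasta(text: str) -> list[tuple[str, str]]:
    """Parse FASTA-formatted text into (header, sequence) pairs."""
    records = []
    header, chunks = None, []
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def find_exact_duplicates(records: list[tuple[str, str]]) -> dict[str, list[str]]:
    """Map each sequence that occurs more than once to the headers sharing it."""
    headers_by_sequence = defaultdict(list)
    for header, sequence in records:
        headers_by_sequence[sequence.upper()].append(header)
    return {seq: hdrs for seq, hdrs in headers_by_sequence.items() if len(hdrs) > 1}


# Toy input, for illustration only; not sequences from the dataset.
toy_fasta = """>seq1
MKTAYIAK
QR
>seq2
MSTNPKPQ
>seq3
MKTAYIAKQR
"""

toy_records = read_fasta(toy_fasta)
toy_duplicates = find_exact_duplicates(toy_records)
print(len(toy_records), toy_duplicates)
```

To run it on a real file, the file contents can be read with `open(path).read()` and passed to `read_fasta`; an empty result from `find_exact_duplicates` would indicate no exact duplicates.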


Fulbright-Garcia Robles Scholarship

Henry Luce III Fund for Distinguished Scholarship Award from The College of Wooster

Consejo Nacional Consultivo Fitosanitario (CONACOFI)

Instituto de Ecología, Universidad Nacional Autónoma de México