Spatial-temporal expression analysis of lineage-restricted shell matrix proteins in the slipper snail Crepidula atrasolea reveals shell field regionalization and distinct cell populations
Data files
May 31, 2024 version files 45.39 GB
- Hybrid-Transcriptome.zip (295.88 MB)
- kinfin-scripts.zip (13.13 KB)
- kinfin.tar.gz (1.28 GB)
- ortho-scripts.zip (11.30 KB)
- Proteomes-Apr2024.tar.gz (717.68 MB)
- README.md (3.04 KB)
- Results_Apr17.tar.gz (38.56 GB)
- Results_Apr27.tar.gz (3.06 GB)
- Results_Apr29_1.tar.gz (1.45 GB)
- SMP_trees.zip (17.65 MB)
Abstract
Mollusca is a morphologically diverse metazoan phylum, exhibiting an immense variety of calcium carbonate shells. Biomineralization of the shell involves shell matrix proteins (SMPs). While SMP diversity is hypothesized to drive molluscan shell diversity, we are just starting to unravel SMP evolutionary history and biology. Here we leveraged two complementary molluscan model systems, the marine slipper snails Crepidula fornicata and Crepidula atrasolea, to determine the evolutionary lineage of 185 Crepidula SMPs. We found that 95% of the adult C. fornicata shell proteome belongs to conserved metazoan and molluscan orthogroups, with molluscan-restricted orthogroups containing half of all SMPs in the shell proteome. The low number of C. fornicata-restricted SMPs contradicts the generally held notion that an animal’s biomineralization toolkit is dominated by mostly novel genes. Next, we selected a subset of SMPs across evolutionary lineages for spatial-temporal analysis using in situ hybridization chain reaction (HCR) during shell development in C. atrasolea. We found that 12 of the 18 SMPs we analyzed are expressed in the shell tissue. These transcripts are present in five expression patterns, which define at least three distinct cell populations within the shell field. These results represent the most comprehensive analysis of gastropod SMP evolutionary age and shell field expression patterns to date. Collectively, these data lay the foundation for future work to interrogate the molecular mechanisms and cell fate decisions underlying molluscan mantle specification and diversification.
https://doi.org/10.5061/dryad.zpc866tf1
OrthoFinder2 and KinFin results from analyses run in April 2024.
1. OrthoFinder Dataset
Results_Apr17.tar.gz: Part 1 OrthoFinder results (run with -og: stop after inferring orthogroups). Script2_ortho_infer_orthogroups.sh was used to run this step.
- Comparative_Genomics_Statistics
- Orthogroups
- Orthogroup_Sequences
- Single_Copy_Orthologue_Sequences
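To check which orthogroup a given protein was assigned to, the Orthogroups directory can be searched directly. The following is a minimal sketch, assuming the standard OrthoFinder layout in which Orthogroups/Orthogroups.tsv is tab-delimited, has an "Orthogroup" column followed by one column per proteome, and separates genes within a cell by commas. The species names and gene IDs in the demo table are made up for illustration and do not come from this dataset.

```python
import csv


def find_orthogroup(lines, gene_id):
    """Return (orthogroup, species_column) for gene_id, or None if absent.

    `lines` is any iterable of text lines, e.g. an open file handle on
    Orthogroups.tsv (assumed layout: tab-delimited, genes comma-separated).
    """
    reader = csv.reader(lines, delimiter="\t")
    header = next(reader)  # "Orthogroup", then one column per proteome
    for row in reader:
        for species, cell in zip(header[1:], row[1:]):
            genes = [g.strip() for g in cell.split(",") if g.strip()]
            if gene_id in genes:
                return row[0], species
    return None


# Hypothetical three-line table so the sketch runs without the large archive.
DEMO = [
    "Orthogroup\tspeciesA\tspeciesB",
    "OG0000000\tgeneA1, geneA2\tgeneB1",
    "OG0000001\t\tgeneB2",
]

print(find_orthogroup(DEMO, "geneB2"))  # ('OG0000001', 'speciesB')
```

With the extracted archive, the same function can be called on a file handle, for example `with open(path_to_tsv, newline="") as fh: find_orthogroup(fh, some_id)`, where `path_to_tsv` and `some_id` are placeholders.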
Results_Apr27.tar.gz: Part 2 OrthoFinder results (run with -fg: start analysis from the orthogroups in an OrthoFinder directory, and -ot: stop after inferring gene trees for orthogroups). Script3_ortho_infer_gene_trees.sh was used to run this step.
- Gene_Trees
- MultipleSequenceAlignments
- Orthogroup_Sequences
- Orthologues
Results_Apr29_1.tar.gz: Part 3 OrthoFinder results (run with -ft: start analysis from the gene trees in an OrthoFinder directory). Script4_rooted_gene_trees.sh was used to run this step.
- Gene_Duplication_Events
- Orthologues
- Phylogenetic_Hierarchical_Orthogroups
- Resolved_Gene_Trees
- Comparative_Genomics_Statistics
- Phylogenetically_Misplaced_Genes
- Putative_Xenologs
- Species_Tree
Proteomes-Apr2024.tar.gz: FASTA files for the 95 species proteomes used to run the OrthoFinder analysis.
SMP_trees.tar.gz (listed under Data files as SMP_trees.zip): Gene trees for 185 SMPs from Crepidula fornicata and their multiple sequence alignment (MSA) files.
- smp-tree: Directory containing SMP orthogroup gene trees
- smp-msa: Directory containing SMP orthogroup multiple sequence alignments
Ortho_scripts.tar.gz (listed under Data files as ortho-scripts.zip): Commands used to run the OrthoFinder analysis.
- 1_copy_proteomes_april2024.sh
- 2_ortho_infer_orthogroups.sh
- 3_ortho_infer_gene_trees.sh
- 4_rooted_gene_trees.sh
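The three result archives in this section correspond to a staged OrthoFinder run in which each stage restarts from the previous stage's output. The sketch below only assembles the command lines for such a run so the chaining of -og, -fg/-ot, and -ft is explicit. The directory names are hypothetical placeholders, and any extra options (threads, aligner, tree method) used in the deposited scripts are not reproduced here, so the scripts in the archive remain the authoritative record.

```python
import shlex


def staged_orthofinder_commands(proteome_dir, stage1_results, stage2_results):
    """Build argument lists for a three-stage OrthoFinder run.

    All three paths are placeholders supplied by the caller; nothing is
    executed here.
    """
    return [
        # Stage 1: infer orthogroups from the proteomes, then stop (-og).
        ["orthofinder", "-f", proteome_dir, "-og"],
        # Stage 2: restart from orthogroups (-fg), stop after gene trees (-ot).
        ["orthofinder", "-fg", stage1_results, "-ot"],
        # Stage 3: restart from gene trees (-ft) and run to completion.
        ["orthofinder", "-ft", stage2_results],
    ]


if __name__ == "__main__":
    # Hypothetical directory names, for illustration only.
    for cmd in staged_orthofinder_commands(
        "proteomes/", "results_stage1/", "results_stage2/"
    ):
        print(shlex.join(cmd))
```

Returning argument lists rather than running the commands keeps the sketch safe to execute on a machine without OrthoFinder installed; each list could be passed to `subprocess.run` on a system where it is available.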
2. KinFin Dataset
kinfin.tar.gz: KinFin results and evolutionary analysis of SMP lineage-restricted genes.
- kinfin_results
- Analysis
- class
- family
- genus
- order
- phylum
- superfamily
- TAXON
- tree
- 185-SMPs
- class-kinfin-output
- genus-kinfin-output
- TAXON-kinfin-output
- family-kinfin-output
- order-kinfin-output
- phylum-kinfin-output
- superfamily-kinfin-output
- kinfin_results
kinfin_scripts.tar.gz (listed under Data files as kinfin-scripts.zip): Commands used to run the KinFin analysis.
- 1_submit_kinfin.sh
- 2_post.kinfin.sh
3. Hybrid Transcriptome
10_original_Catra_transcripts_stranded_cdhit95_stranded_noAlien.fasta: Nucleotide transcriptome for C. atrasolea. Submitted to and accepted by NCBI TSA (GKTO01000000).
10_original_Catra_transcripts_stranded_cdhit95_stranded_noAlien.pep: Translated proteome of the hybrid transcriptome for C. atrasolea. Used in the OrthoFinder analysis.
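A quick sanity check after unpacking either file is to count its records and total sequence length. The following is a minimal sketch that assumes plain FASTA formatting (header lines starting with ">", sequence on the following lines). The two demo records are invented and are not taken from the transcriptome.

```python
def fasta_summary(lines):
    """Return (number_of_records, total_sequence_length) for FASTA text.

    `lines` is any iterable of text lines, e.g. an open file handle.
    """
    records = 0
    residues = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            records += 1  # each header starts a new record
        else:
            residues += len(line)  # sequence may wrap over several lines
    return records, residues


# Hypothetical records, for illustration only.
DEMO = [">t1", "MKV", "LLA", ">t2", "MA"]

print(fasta_summary(DEMO))  # (2, 8)
```

The same function works on the real files by passing an open handle, for example `with open(path_to_pep) as fh: fasta_summary(fh)`, where `path_to_pep` is a placeholder for the extracted file's location.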