Data from: Metazoan mitochondrial gene sequence reference datasets for taxonomic assignment of environmental samples
Data files
Feb 22, 2018 version files 380.02 MB
- MIDORI_LONGEST_1.1_A6_RDP.fasta.zip (1.66 MB)
- MIDORI_LONGEST_1.1_A6_SPINGO.fasta.zip (1.58 MB)
- MIDORI_LONGEST_1.1_A8_RDP.fasta.zip (510.94 KB)
- MIDORI_LONGEST_1.1_A8_SPINGO.fasta.zip (455.02 KB)
- MIDORI_LONGEST_1.1_COI_RDP.fasta.zip (20.16 MB)
- MIDORI_LONGEST_1.1_COI_SPINGO.fasta.zip (19.59 MB)
- MIDORI_LONGEST_1.1_COII_RDP.fasta.zip (2.74 MB)
- MIDORI_LONGEST_1.1_COII_SPINGO.fasta.zip (2.62 MB)
- MIDORI_LONGEST_1.1_COIII_RDP.fasta.zip (1.53 MB)
- MIDORI_LONGEST_1.1_COIII_SPINGO.fasta.zip (1.46 MB)
- MIDORI_LONGEST_1.1_Cytb_RDP.fasta.zip (7.49 MB)
- MIDORI_LONGEST_1.1_Cytb_SPINGO.fasta.zip (7.29 MB)
- MIDORI_LONGEST_1.1_lrRNA_RDP.fasta.zip (8.81 MB)
- MIDORI_LONGEST_1.1_lrRNA_SPINGO.fasta.zip (8.54 MB)
- MIDORI_LONGEST_1.1_ND1_RDP.fasta.zip (2.74 MB)
- MIDORI_LONGEST_1.1_ND1_SPINGO.fasta.zip (2.65 MB)
- MIDORI_LONGEST_1.1_ND2_RDP.fasta.zip (4.66 MB)
- MIDORI_LONGEST_1.1_ND2_SPINGO.fasta.zip (4.55 MB)
- MIDORI_LONGEST_1.1_ND3_RDP.fasta.zip (947.12 KB)
- MIDORI_LONGEST_1.1_ND3_SPINGO.fasta.zip (884.32 KB)
- MIDORI_LONGEST_1.1_ND4_RDP.fasta.zip (3.31 MB)
- MIDORI_LONGEST_1.1_ND4_SPINGO.fasta.zip (3.22 MB)
- MIDORI_LONGEST_1.1_ND4L_RDP.fasta.zip (669.91 KB)
- MIDORI_LONGEST_1.1_ND4L_SPINGO.fasta.zip (615.30 KB)
- MIDORI_LONGEST_1.1_ND5_RDP.fasta.zip (3.69 MB)
- MIDORI_LONGEST_1.1_ND5_SPINGO.fasta.zip (3.60 MB)
- MIDORI_LONGEST_1.1_ND6_RDP.fasta.zip (1.09 MB)
- MIDORI_LONGEST_1.1_ND6_SPINGO.fasta.zip (1.03 MB)
- MIDORI_LONGEST_1.1_srRNA_RDP.fasta.zip (4.32 MB)
- MIDORI_LONGEST_1.1_srRNA_SPINGO.fasta.zip (4.15 MB)
- MIDORI_UNIQUE_1.1_A6_RDP.fasta.zip (2.41 MB)
- MIDORI_UNIQUE_1.1_A6_SPINGO.fasta.zip (2.29 MB)
- MIDORI_UNIQUE_1.1_A8_RDP.fasta.zip (651.34 KB)
- MIDORI_UNIQUE_1.1_A8_SPINGO.fasta.zip (577.38 KB)
- MIDORI_UNIQUE_1.1_COI_RDP.fasta.zip (52.75 MB)
- MIDORI_UNIQUE_1.1_COI_SPINGO.fasta.zip (51.05 MB)
- MIDORI_UNIQUE_1.1_COII_RDP.fasta.zip (4.09 MB)
- MIDORI_UNIQUE_1.1_COII_SPINGO.fasta.zip (3.89 MB)
- MIDORI_UNIQUE_1.1_COIII_RDP.fasta.zip (2.08 MB)
- MIDORI_UNIQUE_1.1_COIII_SPINGO.fasta.zip (1.98 MB)
- MIDORI_UNIQUE_1.1_Cytb_RDP.fasta.zip (19.01 MB)
- MIDORI_UNIQUE_1.1_Cytb_SPINGO.fasta.zip (18.43 MB)
- MIDORI_UNIQUE_1.1_lrRNA_RDP.fasta.zip (14.72 MB)
- MIDORI_UNIQUE_1.1_lrRNA_SPINGO.fasta.zip (14.22 MB)
- MIDORI_UNIQUE_1.1_ND1_RDP.fasta.zip (4.03 MB)
- MIDORI_UNIQUE_1.1_ND1_SPINGO.fasta.zip (3.87 MB)
- MIDORI_UNIQUE_1.1_ND2_RDP.fasta.zip (8.52 MB)
- MIDORI_UNIQUE_1.1_ND2_SPINGO.fasta.zip (8.30 MB)
- MIDORI_UNIQUE_1.1_ND3_RDP.fasta.zip (1.24 MB)
- MIDORI_UNIQUE_1.1_ND3_SPINGO.fasta.zip (1.16 MB)
- MIDORI_UNIQUE_1.1_ND4_RDP.fasta.zip (5.20 MB)
- MIDORI_UNIQUE_1.1_ND4_SPINGO.fasta.zip (5.04 MB)
- MIDORI_UNIQUE_1.1_ND4L_RDP.fasta.zip (822.80 KB)
- MIDORI_UNIQUE_1.1_ND4L_SPINGO.fasta.zip (752.99 KB)
- MIDORI_UNIQUE_1.1_ND5_RDP.fasta.zip (5.34 MB)
- MIDORI_UNIQUE_1.1_ND5_SPINGO.fasta.zip (5.20 MB)
- MIDORI_UNIQUE_1.1_ND6_RDP.fasta.zip (1.40 MB)
- MIDORI_UNIQUE_1.1_ND6_SPINGO.fasta.zip (1.32 MB)
- MIDORI_UNIQUE_1.1_srRNA_RDP.fasta.zip (6.68 MB)
- MIDORI_UNIQUE_1.1_srRNA_SPINGO.fasta.zip (6.40 MB)
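The file names listed here appear to follow one pattern: dataset variant (LONGEST or UNIQUE), release number, gene, and target tool (RDP or SPINGO). As a minimal sketch, assuming that pattern holds for every file, the names can be split into their parts as follows. The `MidoriFile` type and `parse_midori_name` function are hypothetical helpers introduced for illustration; they are not part of the dataset or of any published tool.

```python
import re
from typing import NamedTuple, Optional


class MidoriFile(NamedTuple):
    """Parts of a file name, as inferred from the listing (an assumption)."""
    variant: str  # "LONGEST" or "UNIQUE"
    release: str  # e.g. "1.1"
    gene: str     # e.g. "COI", "Cytb", "lrRNA"
    tool: str     # "RDP" or "SPINGO"


# Pattern inferred from the names in the listing; not an official specification.
_NAME_PATTERN = re.compile(
    r"^MIDORI_(LONGEST|UNIQUE)_(\d+(?:\.\d+)*)_([A-Za-z0-9]+)_(RDP|SPINGO)"
    r"\.fasta\.zip$"
)


def parse_midori_name(name: str) -> Optional[MidoriFile]:
    """Return the parsed parts, or None if the name does not fit the pattern."""
    match = _NAME_PATTERN.match(name)
    return MidoriFile(*match.groups()) if match else None


if __name__ == "__main__":
    print(parse_midori_name("MIDORI_UNIQUE_1.1_ND4L_SPINGO.fasta.zip"))
```

Returning `None` for a name that does not match, rather than raising, makes it easy to filter a mixed directory listing down to the files of one gene or one tool.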
Abstract
Mitochondrial-encoded genes are increasingly targeted in studies using high-throughput sequencing approaches for characterizing metazoan communities from environmental samples (e.g., plankton, meiofauna, filtered water). Yet, unlike nuclear ribosomal RNA markers, there is to date no high-quality reference dataset available for taxonomic assignments. Here, we retrieved all metazoan mitochondrial gene sequences from GenBank, and then quality filtered and formatted the datasets for use with taxonomic assignment tools. The reference datasets, ‘Midori references’, are available for download at www.reference-midori.info. Two versions are provided: (I) Midori-UNIQUE, which contains all unique haplotypes associated with each species, and (II) Midori-LONGEST, which contains a single sequence, the longest, for each species. Overall, the mitochondrial Cytochrome oxidase subunit I gene was the most sequence-rich gene. However, sequences of the mitochondrial large ribosomal subunit RNA and Cytochrome b apoenzyme genes were observed for a large number of species in some phyla. The Midori reference is compatible with some taxonomic assignment software. Therefore, automated high-throughput sequence taxonomic assignments can be particularly effective using these datasets.
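The difference between the two versions can be illustrated with a small sketch. It assumes records have already been reduced to `(species, sequence)` pairs; parsing of the FASTA headers is deliberately left out because the header layout is not described here. The function names and the tie-breaking rule (the first sequence seen wins among equal lengths) are assumptions for illustration and may differ from the authors' actual pipeline, which also includes quality filtering.

```python
from typing import Dict, Iterable, List, Tuple

Record = Tuple[str, str]  # (species name, nucleotide sequence)


def unique_haplotypes(records: Iterable[Record]) -> Dict[str, List[str]]:
    """Keep each distinct sequence once per species (UNIQUE-style selection)."""
    by_species: Dict[str, List[str]] = {}
    for species, seq in records:
        haplotypes = by_species.setdefault(species, [])
        if seq not in haplotypes:
            haplotypes.append(seq)
    return by_species


def longest_per_species(records: Iterable[Record]) -> Dict[str, str]:
    """Keep only the longest sequence per species (LONGEST-style selection).

    Assumption: among sequences of equal length, the first one seen is kept.
    """
    best: Dict[str, str] = {}
    for species, seq in records:
        if species not in best or len(seq) > len(best[species]):
            best[species] = seq
    return best


if __name__ == "__main__":
    toy = [
        ("Homo sapiens", "ACGT"),
        ("Homo sapiens", "ACGTAC"),
        ("Homo sapiens", "ACGT"),  # duplicate haplotype
        ("Mus musculus", "GG"),
    ]
    print(unique_haplotypes(toy))
    print(longest_per_species(toy))
```

On the toy input, the UNIQUE-style selection keeps two haplotypes for the first species, while the LONGEST-style selection keeps one sequence per species. This matches the listing, where each UNIQUE file is larger than its LONGEST counterpart.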
- Machida, Ryuji J.; Leray, Matthieu; Ho, Shian-Lei; Knowlton, Nancy (2017), Metazoan mitochondrial gene sequence reference datasets for taxonomic assignment of environmental samples, Scientific Data, https://doi.org/10.1038/sdata.2017.27
