1. Environmental DNA (eDNA) metabarcoding is a promising method to monitor species and community diversity that is rapid, affordable, and non-invasive. Longstanding needs of the eDNA community are modular informatics tools, comprehensive and customizable reference databases, flexibility across high-throughput sequencing platforms, fast multilocus metabarcode processing, and accurate taxonomic assignment. As bioinformatics tools continue to improve, addressing each of these demands within a single bioinformatics toolkit is becoming a reality.

2. Here we present an open access modular metabarcode sequence toolkit, Anacapa (https://github.com/limey-bean/Anacapa/), that addresses the above needs, allowing users to build comprehensive reference databases and process raw multilocus metabarcode sequence data to accurately characterize communities. A novel aspect of Anacapa is our database builder, Creating Reference libraries Using eXisting tools (CRUX), which generates comprehensive reference databases for specific user-defined metabarcode loci. The Quality Control and Dereplication module sorts and processes multiple metabarcode loci and handles merged, unmerged and unpaired reads, maximizing recovered diversity. Next, DADA2 detects amplicon sequence variants (ASVs), and the Anacapa Classifier module aligns these ASVs to CRUX-generated reference databases using Bowtie2. Taxonomy is assigned to ASVs with confidence scores using a Bayesian Lowest Common Ancestor (BLCA) method. The Anacapa toolkit also includes an R package, ranacapa, for automated results exploration through standard biodiversity statistical analysis.

3. Comparisons with other published reference databases show that CRUX generates broad, comprehensive reference databases that capture more taxonomic diversity. A variety of benchmarking approaches show that the Anacapa Classifier module's Bowtie2-BLCA assigns robust, high-quality taxonomy to both MiSeq and HiSeq-length eDNA metabarcode sequences. We further demonstrate the utility of the Anacapa Toolkit by assigning taxonomy to eDNA sequences from terrestrial and marine samples from southern California through CaleDNA (http://www.ucedna.com/).

4. The Anacapa Toolkit broadens the exploration of eDNA and assists in biodiversity assessment and management by generating metabarcode-specific databases, processing multilocus data, retaining all read types, and expanding non-traditional eDNA targets. Anacapa software and source code are open and available in a virtual container to ease installation.
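The lowest-common-ancestor idea behind the taxonomy assignment step can be illustrated with a short sketch. This is a simplified, non-Bayesian stand-in and not the BLCA implementation used by the Anacapa Classifier: it treats every reference hit equally and reports, rank by rank, the fraction of hits that agree. The function name `consensus_taxonomy`, the `min_support` threshold, the rank list, and the example hits are all hypothetical and chosen only for illustration.

```python
from collections import Counter

# Assumed rank order for illustration; the actual pipeline may use a different set.
RANKS = ["superkingdom", "phylum", "class", "order", "family", "genus", "species"]


def consensus_taxonomy(hit_lineages, min_support=0.8):
    """Walk down the ranks and keep each taxon while enough hits agree on it.

    hit_lineages: list of lineages, one per reference hit, each a list of
                  taxon names ordered as in RANKS.
    min_support:  hypothetical cutoff; assignment stops at the first rank
                  where the majority taxon is backed by a smaller fraction of hits.
    Returns a list of (rank, taxon, support) tuples.
    """
    if not hit_lineages:
        return []
    total = len(hit_lineages)
    candidates = list(hit_lineages)
    assigned = []
    for depth, rank in enumerate(RANKS):
        # Names at this rank among hits that still agree with the path so far.
        names = [lin[depth] for lin in candidates if depth < len(lin) and lin[depth]]
        if not names:
            break
        taxon, count = Counter(names).most_common(1)[0]
        support = count / total  # unweighted fraction, not a Bayesian posterior
        if support < min_support:
            break
        assigned.append((rank, taxon, round(support, 2)))
        candidates = [lin for lin in candidates
                      if depth < len(lin) and lin[depth] == taxon]
    return assigned


# Made-up hits for one ASV: two congeneric species among three reference matches.
example_hits = [
    ["Eukaryota", "Chordata", "Actinopteri", "Perciformes", "Serranidae",
     "Paralabrax", "Paralabrax clathratus"],
    ["Eukaryota", "Chordata", "Actinopteri", "Perciformes", "Serranidae",
     "Paralabrax", "Paralabrax nebulifer"],
    ["Eukaryota", "Chordata", "Actinopteri", "Perciformes", "Serranidae",
     "Paralabrax", "Paralabrax clathratus"],
]

if __name__ == "__main__":
    for rank, taxon, support in consensus_taxonomy(example_hits):
        print(f"{rank}\t{taxon}\t{support}")
```

With the default threshold, the example stops at genus because only two of the three hits agree on a species. Lowering `min_support` trades taxonomic resolution against confidence, which is the same kind of trade-off that per-rank confidence scores are meant to expose.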

CRUX-CO1

Filtered CRUX-CO1 reference database

CO1.zip

CRUX-FITS

Filtered CRUX-FITS (fungal ITS) reference database

FITS.zip

CRUX-PITS

Filtered CRUX-PITS (Plant ITS2) reference database

PITS.zip

CRUX-16S

Filtered CRUX-16S reference database

16S.zip

CRUX_18S

Filtered CRUX_18S reference database

18S.zip

CRUX_12S

Filtered CRUX_12S reference database

12S.zip

CRUX_V8-9_18S

Filtered CRUX_V8-9_18S reference database

V8-9_18S.zip

CRUX_V4_18S

Filtered CRUX_V4_18S reference database

V4_18S.zip

Anacapa

https://github.com/limey-bean/Anacapa as of 5-10-2019

CRUX_Creating-Reference-libraries-Using-eXisting-tools

https://github.com/limey-bean/CRUX_Creating-Reference-libraries-Using-eXisting-tools as of 5-10-2019

R-P-H

CRUX length restricted Porter & Hajibabaei CO1 reference database

R-mitofish

CRUX length restricted mitofish reference database

R-Midori

CRUX length restricted Midori reference database

R-CO-ARBitrator

CRUX length restricted CO-ARBitrator reference database

R_UNITE

CRUX length restricted UNITE reference database

R-WITS

CRUX length restricted WITS reference database

Data from: Anacapa Toolkit: an environmental DNA toolkit for processing multilocus metabarcode datasets
