Data for: Kelp forest loss and emergence of turf algae reshapes energy flow to predators in a rapidly warming ecosystem
Data files
May 02, 2025 version files 285.34 KB
- Data1_macroalgae_biomass.csv (121.58 KB)
- Data2_roving_fishes.csv (22.74 KB)
- Data3_bulk_CN_stable_isotopes.csv (53.72 KB)
- Data4_AA_d13C_values.csv (31.63 KB)
- Data5_FDA_consumer_means.csv (2.74 KB)
- Data6_POM_18S.csv (34.63 KB)
- README.md (18.31 KB)
Abstract
Climate change is decimating habitat-forming species in ecosystems around the world. Yet the impacts of habitat loss on the energetics of the wider food web remain uncertain for many iconic ecosystems, including cold-water kelp forests. Here, we assessed how the loss of kelp forests and subsequent proliferation of low-lying turf algae in the Gulf of Maine has altered the trophic niches of, and energy acquired by, predatory reef fishes. Bulk tissue δ13C and δ15N analysis of fish muscle showed that fishes in kelp forests had larger trophic niches and greater interspecific niche separation than did fishes on turf reefs. Moreover, δ13C analysis of essential amino acids revealed that kelp-derived energy accounted for the majority of energy used by forest fishes (>50 % on average), whereas fishes on turf reefs compensated for kelp decline via greater reliance on a phytoplankton-based energy channel. Therefore, ecosystem state shifts to turf algae – now a global phenomenon – may have far-reaching impacts on food web energy channels and resilience.
Dataset DOI: 10.5061/dryad.bcc2fqzqz
Description of the data and file structure
Our study took place in four subregions of the Maine coast that span a gradient of temperature and kelp forest habitat condition. At each study site, we conducted surveys to quantify available basal resources (macroalgae) and characterize the fish assemblage. We then collected representative primary producers and the two dominant fish species for stable isotope analysis. We analyzed these fish using both bulk tissue and compound-specific stable isotope analyses to document fish isotopic niches and energy use on kelp-dominated vs. red turf algae-dominated reefs.
Files and variables
File: Data1_macroalgae_biomass.csv
Description: This dataset contains measurements of seaweed biomass (i.e., kelp, red macroalgae, and other seaweeds) collected from replicate 1 square meter quadrats (n = 4-6 per site) at multiple sites and subregions in coastal Maine. Macroalgae biomass surveys were conducted during the summers of 2021 and 2022 on shallow, subtidal rocky reefs (5-7 m depth MLLW) across Maine’s exposed outer coast.
- See “Section 1: Underwater surveys” in the methods for details.
Variables
- survey_date: the date on which the survey was conducted
- site: the name of the location where the survey took place, corresponds to lat and long
- region: the subregion of coastal Maine encompassing the study site (Casco Bay, Midcoast, Penobscot Bay, or Downeast)
- lat: latitude coordinate of survey site (decimal degrees of transect, meter mark 0)
- long: longitude coordinate of survey site (decimal degrees of transect, meter mark 0)
- meter_mark: the distance (in meters) along a given transect where the quadrat was placed for sample collection
- taxa: the lowest taxonomic classification to which the sample was identified in the lab. This is either a species, a genus, or a morphological grouping. Note: in 2021, Vertebrata species were identified to species level (V. fucoides or V. nigra) when possible, whereas in 2022, they were always combined as Vertebrata spp.
- group: a broader classification based on higher-level taxonomy and morphology (kelp, brown_other, green, red_turf, red_other)
- mass_g: the mass (biomass) of the sampled macroalgae, scaled to grams per square meter. Samples were spun (in a salad spinner or mesh bag) to remove excess water before weighing. ‘NA’ = biomass measurement is missing
File: Data2_roving_fishes.csv
Description: This dataset contains fish count and length (in 2.5 cm size bins) estimates recorded during surveys of replicate 40 x 2 m transects on subtidal rocky reefs across the coast of Maine. Biomass (in grams) of each individual was computed using a known, species-specific length-weight relationship.
We used 2.5 cm “size bins” to estimate fish total length. The smallest size bin was 1-2.5 cm, and then bins increased in increments of 2.5 cm (2.5-5, 5-7.5, 7.5-10, etc.). However, some estimates are integrated across multiple bins; for example, when sizing fish from a distance, or schools with many individuals of differing sizes.
Note: each row is an observation, and in many cases, multiple observations were made per transect. ‘na’ = not applicable.
- See “Section 1: Underwater surveys” in the methods for more details.
Variables
- survey_date: the date when the fish survey was conducted
- region: the subregion of Maine encompassing the study site (Casco Bay, Midcoast, Penobscot Bay, or Downeast)
- site: the name of the location where the survey took place, corresponds to lat and long
- lat: latitude coordinate of survey site (decimal degrees of transect 1, meter mark 0)
- long: longitude coordinate of survey site (decimal degrees of transect 1, meter mark 0)
- visibility_m: water visibility at the site in meters, measured by divers as the maximum horizontal distance from which they could see a Secchi disk. A single measurement was taken per survey. ‘NR’ indicates that visibility was not recorded during the survey
- transect: identifier for the transect replicate along which fish were observed and measured (1, 2, or 3). Transects were 40 m x 2 m and ran parallel to the reef at ~5-7 m depth. There was a minimum of 10 m between transects
- taxa: the identity (common name) of the fish observed. ‘no_fish’ indicates no fish were observed along the transect. Fish taxa include pollock (Pollachius virens), cunner (Tautogolabrus adspersus), gunnel (Pholis gunnellus), cod (Gadus morhua), sculpin (Myoxocephalus spp.), and winter_flounder (Pseudopleuronectes americanus)
- length_min: the lower end of the size bin for the observed fish, in centimeters
- length_max: the upper end of the size bin for the observed fish, in centimeters
- length_cm: length estimate in centimeters, calculated as the average of length_min and length_max
- count: counted or estimated number of fish in a given observation
- a: species-specific coefficient ‘a’ for calculating biomass from length, derived from length-weight parameters on FishBase
- b: species-specific coefficient ‘b’ for calculating biomass from length, derived from length-weight parameters on FishBase
- biomass_g: the estimated biomass of fish (single or multiple individuals) in a given observation in grams, calculated from: (a × length_cm^b ) × count
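As a rough illustration of how length_cm and biomass_g relate to the other columns, the sketch below recomputes both for a single observation. The function name and the coefficient values are hypothetical and chosen only for the example; real a and b values should be read from the file.

```python
def observation_biomass_g(length_min, length_max, count, a, b):
    """Return (length_cm, biomass_g) for one survey observation.

    length_cm is the midpoint of the size bin; biomass_g follows the
    formula given for the biomass_g column: (a * length_cm^b) * count.
    """
    length_cm = (length_min + length_max) / 2
    biomass_g = (a * length_cm ** b) * count
    return length_cm, biomass_g


# Hypothetical inputs: three fish in the 10-12.5 cm bin, with made-up
# length-weight coefficients (NOT values from FishBase or the dataset).
length_cm, biomass_g = observation_biomass_g(10.0, 12.5, 3, a=0.01, b=3.0)
print(length_cm, biomass_g)
```

Rows flagged ‘no_fish’ or containing ‘na’ would need to be skipped before applying a function like this.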
File: Data3_bulk_CN_stable_isotopes.csv
Description: This dataset contains bulk tissue δ¹³C and δ¹⁵N values for primary producers, invertebrates, and fish collected across the coast of Maine in 2022.
- See “Section 2: Isotope sample collection & processing” in the methods for details about sample collection.
- See “Section 3: Bulk tissue δ¹³C and δ¹⁵N analysis” in the methods for details about sample analysis.
Variables
- sample_code: unique identifier for each sample. The first letter of the code denotes sample type (‘P’ = particulate organic matter (POM), ‘A’ = macroalgae, ‘I’ = invertebrate whole bodies or tissues, ‘F’ = fish muscle tissue)
- taxa: the identity of the sample to the lowest taxonomic level (species, genus, higher taxonomic grouping, or POM)
- group: broader taxonomic or functional group, based on taxa
- type: broader groupings to categorize the sample (POM, macroalgae, invertebrate, or fish)
- region: the subregion of Maine encompassing the sample collection site (Casco Bay, Midcoast, Penobscot Bay, or Downeast)
- site: the name of the sample collection location
- lat: latitude coordinate of the sample collection location (in decimal degrees). Some POM collection sites are approximate; these are noted in notes
- long: longitude coordinate of the sample collection location (in decimal degrees). Some POM collection sites are approximate; these are noted in notes
- collection_date: the date the sample was collected
- length_mm: length of the sampled organism in millimeters, if applicable. ‘na’ if not measured
- wet_weight_g: the measured weight of the sampled organism in grams. ‘na’ if not measured. ‘nr’ = weight measurement is missing
- d13C: the stable carbon isotope ratio (δ¹³C) of the sample, measured by IRMS and corrected based on internal standards. Units are per mil (‰)
- d15N: the stable nitrogen isotope ratio (δ¹⁵N) of the sample, measured by IRMS and corrected based on internal standards. Units are per mil (‰)
- C.N: the carbon-to-nitrogen ratio of the sample, measured by IRMS
- notes: additional observations or metadata related to the sample. ‘na’ if no notes were made
File: Data4_AA_d13C_values.csv
Description: This dataset contains stable carbon isotope (δ¹³C) values of individual amino acids for primary producers and predatory fishes following sample hydrolysis, derivatization, and analysis in a gas-chromatography combustion unit coupled to an isotope-ratio mass spectrometer (GC-C-IRMS). The values have been corrected based on internal reference materials. Data about each sample, reference material, and GC-C-IRMS run are included. The corresponding bulk tissue δ¹³C and δ¹⁵N values for these samples are in Data3_bulk_CN_stable_isotopes.csv; however, this data file includes seven additional samples that were collected in 2023.
Columns named with a three-letter amino acid code contain either the δ¹³C value obtained by GC-C-IRMS and corrected against the reference material, or, where the name ends in ‘SD’, the standard deviation among injections for that amino acid. The three-letter code specifies the amino acid (see below). δ¹³C values may be ‘na’ if we did not obtain data for the particular amino acid in that sample. SD values may be ‘na’ if no usable δ¹³C values were collected or only one injection produced usable δ¹³C values. Units are per mil (‰).
- See “Section 2: Isotope sample collection & processing” in the methods for details about sample collection.
- See “Section 4: Stable carbon isotope analysis of essential amino acids (δ¹³CEAA)” in the methods for details about sample analysis.
- See “Section 5: Flexible discriminant analysis (FDA) to quantify consumer energy channel use” for details on how we use these data to quantify the proportional contribution of various energy channels to consumers.
Variables
- sample_code: unique identifier for each sample. The first letter of the code denotes sample type (‘P’ = particulate organic matter (POM), ‘A’ = macroalgae, ‘F’ = fish muscle tissue)
- taxa: the identity of the sample to the lowest taxonomic level (species, genus, higher taxonomic grouping, or POM)
- group: broader taxonomic or functional group, based on taxa
- type: broader groupings to categorize the sample (POM, macroalgae, or fish)
- site: the name of the sample collection location
- region: the subregion of Maine encompassing the sample collection site (Casco Bay, Midcoast, Penobscot Bay, or Downeast)
- collection_date: the date the sample was collected
- length_mm: length of the sample organism in millimeters, if applicable. ‘na’ if not measured
- hydrolysis_mass_mg: mass of the sample used for hydrolysis in milligrams. ‘na’ if not applicable, as for POM samples, since the samples were on a filter
- ref_1: identifier for one internal reference material with which the sample was derivatized. The three-letter code denotes standard type (UNM or STD); the three-digit code that follows denotes standard batch
- ref_2: identifier for a second internal reference material with which the sample was derivatized. The three-letter code denotes standard type (UNM or STD); the three-digit code that follows denotes standard batch. ‘na’ if only one internal reference material was made with the respective hydrolysis batch
- runID: numeric identifier for the GC-C-IRMS run
- run_date: date of the GC-C-IRMS run
- STD: identifier number of the reference material which was run on the GC-C-IRMS alongside the sample in question and used to correct the δ¹³C values for each amino acid (3-number code corresponding to ref_1 or ref_2)
- STD.t: type of the reference material which was run on the GC-C-IRMS alongside the sample in question and used to correct the δ¹³C values for each amino acid (UNM or STD, corresponding to number code in ref_1 or ref_2)
- N.inj: number of injections made for the sample during analysis, from which the following values were derived
- Ala13C: alanine δ¹³C value, ‰
- Ala13C.SD: alanine δ¹³C standard deviation between injections
- Gly13C: glycine δ¹³C value, ‰
- Gly13C.SD: glycine δ¹³C standard deviation between injections
- Thr13C: threonine δ¹³C value, ‰. This is an essential amino acid
- Thr13C.SD: threonine δ¹³C standard deviation between injections
- Ser13C: serine δ¹³C value, ‰
- Ser13C.SD: serine δ¹³C standard deviation between injections
- Val13C: valine δ¹³C value, ‰. This is an essential amino acid
- Val13C.SD: valine δ¹³C standard deviation between injections
- Leu13C: leucine δ¹³C value, ‰. This is an essential amino acid
- Leu13C.SD: leucine δ¹³C standard deviation between injections
- Ile13C: isoleucine δ¹³C value, ‰. This is an essential amino acid
- Ile13C.SD: isoleucine δ¹³C standard deviation between injections
- Pro13C: proline δ¹³C value, ‰
- Pro13C.SD: proline δ¹³C standard deviation between injections
- Asp13C: aspartic acid δ¹³C value, ‰
- Asp13C.SD: aspartic acid δ¹³C standard deviation between injections
- Glu13C: glutamic acid δ¹³C value, ‰
- Glu13C.SD: glutamic acid δ¹³C standard deviation between injections
- Phe13C: phenylalanine δ¹³C value, ‰. This is an essential amino acid
- Phe13C.SD: phenylalanine δ¹³C standard deviation between injections
- Tyr13C: tyrosine δ¹³C value, ‰
- Tyr13C.SD: tyrosine δ¹³C standard deviation between injections
- Lys13C: lysine δ¹³C value, ‰. This is an essential amino acid
- Lys13C.SD: lysine δ¹³C standard deviation between injections
- Arg13C: arginine δ¹³C value, ‰
- Arg13C.SD: arginine δ¹³C standard deviation between injections
File: Data5_FDA_consumer_means.csv
Description: This dataset is the output from a bootstrap resampling of a flexible discriminant analysis (FDA) classification of consumer (predatory fish) energy channels. This dataset can be directly derived from δ¹³C data of five essential amino acids (Ile, Leu, Thr, Val, and Phe) from Data4_AA_d13C_values.csv; however, because resampling randomly subsets the primary producer data, there may be slight deviations in each iteration. This output was used in our analyses and figure generation.
- See “Section 4: Stable carbon isotope analysis of essential amino acids (δ¹³CEAA)” in the methods for how the original δ¹³CEAA measurements were made.
- See “Section 5: Flexible discriminant analysis (FDA) to quantify consumer energy channel use” for details on how this dataset was derived.
Variables
- sample_code: unique identifier for the sample. All samples are predatory fish (pollock or cunner), and their sample metadata (including which subregion they are from) and original EAA δ¹³C values can be found in Data4_AA_d13C_values.csv
- K.mean: mean probability (from N bootstrap iterations) that the EAA of this fish was derived from kelp
- K.sd: standard deviation of kelp estimates
- P.mean: mean probability (from N bootstrap iterations) that the EAA of this fish was derived from phytoplankton
- P.sd: standard deviation of phytoplankton estimates
- R.mean: mean probability (from N bootstrap iterations) that the EAA of this fish was derived from red macroalgae
- R.sd: standard deviation of red macroalgae estimates
- N: number of iterations used to calculate the means and standard deviations. Set to 10,000
File: Data6_POM_18S.csv
Description: This dataset is derived from amplicon sequencing of the 18S rRNA V4 gene region; taxonomic assignments were made using a dada2 bioinformatics pipeline with the PR2 database. We retained all ASVs that were among the 99% most abundant in each sample, and grouped them here based on species-level taxonomic assignments.
- See “Section 2: Isotope sample collection & processing” in the methods for details about POM sample collection.
- See “Section 6: Particulate organic matter (POM) as a proxy for phytoplankton” in the methods for details about POM sample sequencing and data processing.
Variables
- sample_code: unique identifier for the POM sample, which matches those in the bulk (Data3_bulk_CN_stable_isotopes.csv) and amino acid (Data4_AA_d13C_values.csv) stable isotope datasets
- region: subregion of Maine where the POM sample was collected. Samples were collected at a depth of ~1 m below the surface and ~1 km offshore of the nearest coastline
- tax.Kingdom: kingdom-level taxonomic classification, from a dada2 assignment based on the PR2 database
- tax.Supergroup: supergroup-level taxonomic classification, from a dada2 assignment based on the PR2 database
- tax.Division: division-level taxonomic classification, from a dada2 assignment based on the PR2 database
- tax.Class: class-level taxonomic classification, from a dada2 assignment based on the PR2 database
- tax.Order: order-level taxonomic classification, from a dada2 assignment based on the PR2 database
- tax.Family: family-level taxonomic classification, from a dada2 assignment based on the PR2 database
- tax.Genus: genus-level taxonomic classification, from a dada2 assignment based on the PR2 database
- tax.Species: species- or lowest taxonomic-level classification, from a dada2 assignment based on the PR2 database
- group: manually assigned ecological or functional grouping based on dada2 taxonomic assignments
- phytoplankton: categorical indicator of whether the taxon is phytoplankton. ‘yes’ if the group consists of autotrophic microalgae. ‘no’ if other (e.g., microheterotrophs or animals)
- read_abundance: total number of sequence reads for the taxon in the sample
- read_proportion: proportion of reads for the taxon relative to the total reads in the sample
Section 1: Underwater surveys
Data1_macroalgae_biomass.csv
Data2_roving_fishes.csv
Macroalgae surveys.
During the summers of 2021 and 2022, we characterized macroalgae assemblages on subtidal rocky reefs via scuba (n = 11-16 sites per year, 3-5 per subregion). At each site, divers laid a 40 m transect along the isobath at 5-7 m depth, parallel to shore, then placed 1 m² quadrats (n = 4-6 quadrats per site) at predetermined intervals along the transect. We collected all kelps (i.e., canopy-forming brown macroalgae in the order Laminariales) from the 1 m² quadrat and all understory macroalgae (i.e., the diverse consortium of bladed, foliose, and filamentous seaweed taxa residing under the canopy) from a ¼ m² area of the quadrat. In the lab, we sorted macroalgae by species, identified them to the lowest possible taxonomic level, dried them by spinning, and weighed each. We performed these surveys between July and mid-September (oceanographic summer) of both years.
Fish surveys.
To characterize the mobile reef fish assemblage at each site, we conducted visual fish censuses. While swimming along 40 m transects (n = 3 per site), a diver identified, counted, and estimated the size (total length, 2.5 cm size bins) of each fish that they observed within a 2 m wide band. We calculated the biomass of each fish observation via published length-weight relationships from FishBase (70). Fish surveys were performed at the same time as macroalgae surveys in each subregion (n = 11-16 sites per year, 3-5 sites per subregion) between July and September of each year.
Section 2: Isotope sample collection & processing
Data3_bulk_CN_stable_isotopes.csv
Data4_AA_d13C_values.csv
Data5_FDA_consumer_means.csv
Data6_POM_18S.csv
Collecting primary producer samples.
Based on our macroalgae data and observations of the ecosystem, we determined that two potential benthic energy channels (i.e., bottom-associated, functionally distinct primary producer groups) could exist on shallow, subtidal rocky reefs in the Gulf of Maine: one based on kelps (canopy-forming brown macroalgae, including subtidal Laminariales and intertidal fucoids) and another based on red macroalgae (seaweeds in the phylum Rhodophyta). To characterize the isotopic values of these possible energy sources, we collected abundant macroalgae by hand via scuba, concurrent with macroalgae biomass collections (see above). For subtidal kelps (n = 19, e.g., Laminaria digitata and Saccharina latissima), we took samples from clean parts of the interior blade, approximately 15 cm distal to the growth meristem. For all other macroalgae (n = 44), we selected whole thalli that were free of fouling. To comprehensively characterize the isotopic signature of all abundant basal energy resources, we supplemented our subtidal kelp samples with fucoid brown macroalgae (Ascophyllum nodosum and Fucus spp., n = 6) that were ubiquitous in intertidal habitats adjacent to our subtidal study sites. Since fucoids and kelps share some ecological functions and can have similar isotopic signatures, we included the fucoid samples in the “kelp” isotope endmember group. We carefully rinsed all macroalgae samples with DI water to remove epifauna.
To characterize the isotopic signature of local phytoplankton (which constitute a pelagic energy channel), we collected particulate organic matter (POM) from 1 m depth with a Niskin sampler at sites ~1 to 5 km offshore in all subregions (n = 27). We prefiltered water through an inline 300 µm nitex mesh filter to remove zooplankton and large particulates. In the lab, we collected POM by passing 2.5-4.4 L of the water through pre-combusted GF/F filters (0.7 µm pore size) with a vacuum filter. We sequenced the DNA from a portion of several POM filters (18S metabarcoding) to confirm that phytoplankton comprised the bulk of the organic material trapped on the filter (Supplementary Text, Fig. S5, Fig. S6). Hence, we refer to POM as “phytoplankton” here and elsewhere, for simplicity.
Collecting consumer tissue samples.
To characterize the isotopic signatures of common primary consumers, we collected samples of invertebrates that represented three distinct primary consumer guilds and may serve as fish prey. For this study, we focused on filter-feeding bivalves (blue mussels Mytilus edulis, and rock borer clams Hiatella arctica), grazing snails (Lacuna vincta and Margarites helicinus), and amphipods (which comprise a variety of families but are largely detritivores or surface suspension feeders) (71). These organisms (n = 3-43 samples per group per subregion) were collected from survey sites in late summer of 2022. Divers collected mussels by hand via scuba, and all small, epifaunal mesoinvertebrates were collected from macroalgae by rinsing the macroalgae in DI water in the lab and catching the contents in a 500 µm sieve. To isolate mesoinvertebrates, we looked through the contents of the sieve under a dissecting microscope and sorted mesoinvertebrates based on the lowest identifiable taxonomic level (often species level for bivalves and snails, and family or superfamily for amphipods). We were careful to remove pieces of detritus or macroalgae from mesoinvertebrates.
Invertebrates were frozen after collection. Tissue samples from Mytilus were obtained by dissecting a piece of the adductor muscle. We processed whole snails and Hiatella as bulk samples and subjected them to demineralization in weak (0.5 M) hydrochloric acid for 12 to 24 hours to remove calcium from their shells. We then extracted lipids from all invertebrate samples with three 24-hour soaks (72 hours total) in a 2:1 chloroform:methanol solution, exchanging the solution between rounds, and then rinsed samples thoroughly in DI water.
We collected fish from rocky reefs in Casco Bay, Midcoast, Penobscot Bay, and Downeast subregions between September and November of 2022. We used hook & line fishing to target pollock (n = 10-14 per subregion) and deployed minnow traps baited with mussels to catch cunner (n = 5-15 per subregion). Fishes were euthanized in the field using an MS-222 seawater solution in accordance with the University of Maine IACUC (protocol A2022-08-02). Specimens were temporarily stored on ice, then dissected in the lab the same day they were caught. We excised a piece of dorsal muscle tissue for isotopic analysis, which was rinsed with DI water and stored frozen. As with invertebrate tissues, we lipid-extracted fish tissues with a 2:1 chloroform:methanol solution for 72 hours and then rinsed them with DI water.
We kept samples in muffled 20 ml scintillation vials (borosilicate glass, with foil-lined caps). All samples were stored frozen at -20 °C until they were lyophilized.
Section 3: Bulk tissue δ13C and δ15N analysis
Data3_bulk_CN_stable_isotopes.csv
We packed tissue samples into 3.5 x 5 mm tin capsules (Analytics, Valencia, CA; USA Analytics, Anaheim, CA) to prepare them for bulk tissue δ13C and δ15N analysis. We weighed kelp and other macroalgae samples to ~3.5 mg, and invertebrate samples and fish muscle to ~1 mg. For phytoplankton samples, we packed ~¼ to ½ of each GF/F filter into 5 x 9 mm tin capsules.
Bulk tissue δ13C and δ15N values were measured via continuous flow on a Costech 4010 elemental analyzer coupled to a Thermo Scientific Delta V Plus isotope ratio mass spectrometer (EA-IRMS) at the University of New Mexico Center for Stable Isotopes (UNM-CSI, Albuquerque, New Mexico, USA). We report all isotope results as δ values with units of per mil (‰):
δ13C or δ15N = [(R_sample / R_standard) - 1] × 1000,
where R represents the 13C:12C or 15N:14N ratios (72). The internationally accepted standards are Vienna-Pee Dee Belemnite for δ13C and atmospheric N2 for δ15N. The isotopic values of our samples were corrected and calibrated to these international standards based on analysis of in-house reference materials (casein and tuna for protein, green chile and blue grama for plants/algae), which had standard deviations below 0.2 ‰ within all runs.
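The delta notation above amounts to a one-line calculation. The sketch below is only an illustration: the function name is made up for this example and the ratio values are hypothetical, not measured ratios or the certified ratio of any standard.

```python
def delta_permil(r_sample, r_standard):
    """Convert heavy:light isotope ratios into a delta value in per mil."""
    return (r_sample / r_standard - 1) * 1000


# Hypothetical ratios: the sample ratio is 2 % lower than the standard ratio,
# so the delta value comes out at about -20 per mil.
r_standard = 0.0112
r_sample = r_standard * 0.98
d13c = delta_permil(r_sample, r_standard)
print(round(d13c, 3))
```

A sample with the same ratio as the standard has a delta value of 0 ‰; negative values indicate that the sample is depleted in the heavy isotope relative to the standard.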
Section 4: Stable carbon isotope analysis of essential amino acids (δ13CEAA)
Data4_AA_d13C_values.csv
Data5_FDA_consumer_means.csv
We used δ13CEAA fingerprinting to assess the energy (carbon) contribution of basal resources to fish from the four subregions across the coast of Maine. To prepare samples for δ13CEAA analysis, we subjected them to hydrolysis, derivatization, and analysis in a gas-chromatography combustion unit coupled to an isotope-ratio mass spectrometer (GC-C-IRMS). First, we hydrolyzed macroalgae samples (4-5 mg), POM filters (~¼ to ½ filter), and lipid-extracted fish muscle (5-6 mg) in 6 M hydrochloric acid. We flushed hydrolysis tubes with N2 gas before sealing and incubated them for 20 hours at 110 °C. Hydrolyzed amino acids from primary producer samples were subsequently filtered through muffled quartz wool to remove any remaining particulates.
We derivatized amino acids into N-trifluoroacetyl isopropyl esters following Silfer et al. (73). Hydrolysates were subsequently dried under N2, esterified in 1 mL of 4:1 isopropanol:acetyl chloride (105 °C, 1 h), dried again under N2 with two rinse cycles using dichloromethane (DCM), and then acetylated in 1 mL of 1:1 trifluoroacetic anhydride:DCM (105 °C, 10 min). All samples for this project were derivatized in batches of 8-25 samples, along with an in-house reference material containing amino acids with known δ13C values measured via EA-IRMS at UNM-CSI (Table S5).
The in-house reference material was a mixture of 12 purified and powdered amino acids (Sigma Aldrich, Saint Louis, MO, USA): alanine, aspartic acid, glutamic acid, glycine, isoleucine, leucine, lysine, phenylalanine, proline, serine, threonine, and valine. To make the reference material, we dissolved amino acid powders into weak hydrochloric acid (< 0.01 M) at a concentration of ~125 mM. Individual amino acid solutions were then mixed together, and a small aliquot of this mixture was dried under N2 gas and derivatized alongside each batch of unknown samples.
We analyzed our derivatized samples and in-house reference material via a GC-C-IRMS at UNM-CSI. Briefly, we injected 1-1.5 µL of each sample into a 60 m BPX5 gas chromatograph column (0.32 mm ID, 1 µm film thickness, SGE Analytical Science) within a Trace 1310 GC, where the amino acids were separated. The separated amino acids were then combusted to CO2 in the high-temperature furnace (1000 °C) of a Thermo Scientific Isolink II and analyzed in a Thermo Scientific Delta V Plus isotope ratio mass spectrometer. We ran all samples in duplicate or triplicate injections and took the average isotopic values for each sample. The within-run standard deviations of all amino acid δ13C values for the in-house reference material measured in this study ranged from 0.2 ‰ (alanine) to 0.5 ‰ (lysine) but averaged < 0.5 ‰ per day, as averaged across 3 to 7 injections. The global standard deviations ranged from 0.5 ‰ (valine) to 1.3 ‰ (phenylalanine) across all runs and standard injections (Table S5).
δ13C values were then corrected based on measurements of the in-house reference material (Table S5), which was analyzed bracketing each sample and at the beginning and end of each run. The reagents used during derivatization add carbon to the side chains of amino acids, and δ13C values measured via GC-C-IRMS are thus a combination of intrinsic amino acid carbon and reagent carbon. However, because amino acid reference materials of known δ13C composition were derivatized and run with each batch of samples, we were able to correct for this carbon addition for each amino acid using the following equation:
δ13C_sample = (δ13C_dsa - δ13C_dst + δ13C_std × p_std) / p_std
Here, δ13C_dsa refers to the measured value of a derivatized amino acid within the sample, and δ13C_dst is the measured value of the derivatized amino acid within the in-house reference material. The term δ13C_std reflects the un-derivatized, or intrinsic, δ13C value of that amino acid in the reference material; these values were determined via EA-IRMS at UNM-CSI (Table S5). Finally, p_std is the proportion of carbon in the measured derivative that was originally sourced from the amino acid, which varies among amino acids. These corrections were done on a daily basis, or on a ‘per run’ basis, which means unknown samples were corrected using four to six bracketed injections of the reference material.
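The correction equation can be written as a small function. This is a minimal sketch with a hypothetical function name and made-up input values; the real intrinsic δ13C values and carbon proportions for each amino acid are not reproduced here.

```python
def correct_aa_d13c(d13c_dsa, d13c_dst, d13c_std, p_std):
    """Remove the contribution of derivatization-reagent carbon.

    d13c_dsa: measured value of the derivatized amino acid in the sample
    d13c_dst: measured value of the derivatized amino acid in the reference
    d13c_std: intrinsic (un-derivatized) value of the reference amino acid
    p_std:    proportion of derivative carbon that came from the amino acid
    """
    return (d13c_dsa - d13c_dst + d13c_std * p_std) / p_std


# Made-up numbers for illustration only (per mil, and a proportion of 0.5).
corrected = correct_aa_d13c(d13c_dsa=-30.0, d13c_dst=-32.0,
                            d13c_std=-20.0, p_std=0.5)
print(corrected)  # -16.0
```

One useful sanity check on the formula: when the derivatized sample and the derivatized reference give the same measured value, the corrected sample value equals the intrinsic value of the reference.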
We generated δ13C values for 12 amino acids; however, our analyses focused on five of the six EAA that we could reliably measure in our primary producers: threonine, valine, leucine, isoleucine, and phenylalanine. These compounds are useful for tracing the movement of organic matter across food webs, as they are not synthesized or modified by consumers and therefore do not undergo discrimination as they move between trophic levels (51, 52). We did not use data for amino acids in samples where standard deviations exceeded 1.0 ‰ among injections. In addition, although lysine is considered essential and was measured, this EAA exhibited significant coelution with tyrosine in many of our primary producer samples, resulting in variable δ13C values. Hence, we did not include lysine data in any statistical analyses.
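The screening rule described here could be applied to a row of Data4_AA_d13C_values.csv roughly as follows. This is a sketch, not the authors' script: the function name is hypothetical, the example row is invented, and keeping values whose SD is ‘na’ (for example, single-injection values) is an assumption, since the text does not say how those were handled.

```python
EAA = ["Thr", "Val", "Leu", "Ile", "Phe"]  # lysine is excluded, as in the text


def usable_eaa_values(row, max_sd=1.0):
    """Return {amino acid: value or None} for the five EAA used in analyses.

    A value is dropped (None) when it is 'na' or when its between-injection
    SD exceeds max_sd. Values with an 'na' SD are kept (an assumption).
    """
    out = {}
    for aa in EAA:
        value = row.get(f"{aa}13C", "na")
        sd = row.get(f"{aa}13C.SD", "na")
        if value == "na" or (sd != "na" and float(sd) > max_sd):
            out[aa] = None
        else:
            out[aa] = float(value)
    return out


# Invented example row, using the column names documented for Data4.
row = {"Thr13C": "-12.3", "Thr13C.SD": "0.4", "Val13C": "-25.0",
       "Val13C.SD": "1.2", "Leu13C": "na", "Leu13C.SD": "na",
       "Ile13C": "-20.1", "Ile13C.SD": "na", "Phe13C": "-27.5",
       "Phe13C.SD": "1.0"}
screened = usable_eaa_values(row)
print(screened)
```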
Section 5: Flexible discriminant analysis (FDA) to quantify consumer energy channel use
Data4_AA_d13C_values.csv
Data5_FDA_consumer_means.csv
Partitioning primary producer groups with a classification model.
To determine whether δ13CEAA could distinguish kelp (n = 16), red macroalgae (n = 17), and phytoplankton (n = 11) – the three primary producer groups that represent distinct energy channels in our system – we employed a flexible discriminant analysis (FDA), a non-parametric classification model. We applied the FDA [R package ‘mda’ (79)] using δ13CEAA from five essential amino acids (isoleucine, leucine, phenylalanine, threonine, valine) as predictors, and assessed the robustness of producer ‘fingerprints’ based on successful reclassification in a leave-one-out cross-validation test. Kelps, red macroalgae, and phytoplankton were sufficiently separated in multivariate δ13CEAA space (see results, Fig. 3, Table S6), and thus we used this FDA model for quantifying energy contributions to consumers.
Quantifying consumers’ basal energy sources.
To quantify the proportion of energy from each producer group used by fishes, we used a bootstrap resampling approach to run 10,000 iterations of the FDA model with a random subset of 10 members of each producer group each time (sampled with replacement). We used each iteration of the model to classify fish based on their δ13CEAA, with the model returning a probability that each individual fish belonged to each producer group. We averaged these posterior probability estimates from all 10,000 iterations to get a mean estimate for each individual (Table S4) and used those mean probabilities of classification as the estimated proportion of energy derived from each producer group (18). We chose this multivariate statistical method (FDA) rather than traditional Bayesian mixing models [e.g., ‘MixSIAR’ (82)] because it can more effectively utilize multiple EAA to partition the proportional contribution of multiple basal resource groups (18). δ13CEAA data naturally avoid the common issues encountered with bulk tissue isotopic data that make mixing models so useful for these studies. Notably, multivariate isotopic patterns of marine algae are conserved across broad spatiotemporal scales (52), and EAA have negligible isotopic offsets with trophic transfer in most cases [e.g., (51, 83)].
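The resampling loop can be sketched in outline as below. Important caveats: the study used FDA from the R package ‘mda’, whereas this sketch substitutes a deliberately simple stand-in classifier (a softmax over distances to group centroids) so that it runs with the Python standard library alone. All names and data values are hypothetical, and the output will not reproduce Data5_FDA_consumer_means.csv.

```python
import math
import random
import statistics


def centroid_softmax(train, consumer):
    """Stand-in classifier (NOT FDA): softmax of negative squared distance
    between the consumer and each producer group's centroid."""
    scores = {}
    for group, rows in train.items():
        centroid = [sum(col) / len(rows) for col in zip(*rows)]
        scores[group] = -sum((x - c) ** 2 for x, c in zip(consumer, centroid))
    top = max(scores.values())
    weights = {g: math.exp(s - top) for g, s in scores.items()}
    total = sum(weights.values())
    return {g: w / total for g, w in weights.items()}


def bootstrap_source_probabilities(producers, consumer, classify,
                                   n_iter=10000, n_per_group=10, seed=0):
    """Resample producers with replacement, classify the consumer on each
    iteration, and return {group: (mean, sd)} of the probabilities."""
    rng = random.Random(seed)
    draws = {g: [] for g in producers}
    for _ in range(n_iter):
        train = {g: rng.choices(rows, k=n_per_group)
                 for g, rows in producers.items()}
        probs = classify(train, consumer)
        for g in producers:
            draws[g].append(probs[g])
    return {g: (statistics.mean(p), statistics.stdev(p))
            for g, p in draws.items()}


# Invented five-EAA vectors (Ile, Leu, Phe, Thr, Val), for illustration only.
producers = {
    "kelp": [[-10, -15, -12, -20, -18], [-11, -16, -13, -21, -19],
             [-9, -14, -11, -19, -17]],
    "phyto": [[-18, -23, -20, -28, -26], [-19, -24, -21, -29, -27],
              [-17, -22, -19, -27, -25]],
    "red": [[-25, -30, -27, -35, -33], [-26, -31, -28, -36, -34],
            [-24, -29, -26, -34, -32]],
}
fish = [-10.5, -15.5, -12.5, -20.5, -18.5]
result = bootstrap_source_probabilities(producers, fish, centroid_softmax,
                                        n_iter=200)
print(result)
```

The per-group means and standard deviations returned here play the same role as the K.mean/K.sd, P.mean/P.sd, and R.mean/R.sd columns, although whether those columns use the sample or the population standard deviation is not stated.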
Section 6: Particulate organic matter (POM) as a proxy for phytoplankton
Data6_POM_18S.csv
To understand the isotopic signature of phytoplankton, we collected water samples (2.5-4.4 L, n = 6-8 per region) and filtered the associated particulate organic matter (POM) onto pre-combusted 0.7 µm pore size GF/F filters. We cut filters into halves, thirds, or quarters to use for both bulk tissue and compound-specific stable isotope analysis. To verify that the “POM” consisted largely of phytoplankton, we sequenced the 18S gene (see below) from a portion of 17 POM filters (n = 3-6 per region).
We extracted the DNA from POM filters using a Qiagen PowerSoil DNA extraction kit following the manufacturer's quick-start protocols, except for the first step, where we added the filter, beads, and lysis buffer to a 5 mL tube for manual lysis. We sent extracts to be indexed and sequenced with a one-step PCR using eukaryote-specific primers targeting the 18S rRNA V4 gene region (E572F & E1009R primers; 84). Amplicon sequencing was carried out on a NextSeq instrument at the Integrated Microbiome Resource (Dalhousie University, Halifax, Nova Scotia, Canada). We processed the dereplicated sequences using a dada2 pipeline (85). Our quality filtering parameters were maxEE = (2,2), auto truncLen, truncQ = 2, and quantile_min = 0.8. We used a minOverlap of 20 bp when merging paired-end reads. We assigned taxonomy to amplicon sequence variants [ASVs (86)] with the PR2 database (87).
From 16 samples, we received 625,616 total reads. Since we were interested in the most abundant components of the POM community, we filtered out all reads assigned to species that made up less than 1 % of the total reads in each sample. We also removed two samples with < 1000 reads (P2: 24 reads, and P10: 365 reads). After filtering, we were left with 14 samples with 34,410 ± 5,419 reads (mean ± SE).
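The filtering steps in this paragraph can be sketched as follows. The function names and the read counts are invented for illustration, and applying the 1000-read threshold to each sample's unfiltered total is an assumption; the text does not specify the order of the two steps.

```python
import math


def filter_rare_taxa(taxon_reads, min_proportion=0.01):
    """Drop taxa contributing less than min_proportion of a sample's reads."""
    total = sum(taxon_reads.values())
    return {t: n for t, n in taxon_reads.items() if n / total >= min_proportion}


def mean_and_se(values):
    """Return the mean and standard error (sample SD / sqrt(n))."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / (n - 1)
    return mean, math.sqrt(variance / n)


# Invented read counts per taxon for three hypothetical POM samples.
samples = {
    "P_a": {"diatom_sp": 9000, "dinoflagellate_sp": 950, "rare_sp": 50},
    "P_b": {"diatom_sp": 300, "copepod_sp": 65},
    "P_c": {"diatom_sp": 4000, "ciliate_sp": 1000},
}

# Remove low-depth samples (< 1000 reads), then rare taxa within each sample.
kept = {name: filter_rare_taxa(reads)
        for name, reads in samples.items()
        if sum(reads.values()) >= 1000}
mean, se = mean_and_se([sum(reads.values()) for reads in kept.values()])
print(kept, mean, se)
```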
We combined the reads that matched to common microalgae or other broad taxonomic groups and found that 86 % of reads did indeed come from phytoplankton (a group of single-celled microalgae reliant at least in large part on autotrophy; Fig. S5) and that the major phytoplankton groups across the coast of Maine are diatoms and dinoflagellates (Fig. S6).
