Integrating pigment and fatty acid profiles for enhanced estimation of seston community composition
Data files
May 30, 2024 version files (93.05 MB):
- Litmanen_et_al_2024_Ecosphere_data_and_codes.zip (93.04 MB)
- README.md (5.75 KB)
Abstract
Climate change, nutrient pollution, and land use alterations influence the primary production of lakes. While light-microscopy counting remains the standard for estimating phytoplankton community composition, its expense and time-consuming nature necessitate cost-effective alternatives for seston analysis. Furthermore, estimating the contribution of seston constituents other than primary producers, that is, non-algal particles, is not possible with light-microscopy counting. The biotracer approach, which uses computational methods and chemotaxonomic biomarkers such as carotenoid pigments and fatty acids, has been utilised as an alternative in seston analysis when species-level taxonomy is not required. However, comprehensive testing of how well carotenoids and fatty acids can be utilised in estimating a wide range of seston phytoplankton communities using different estimation methods is lacking. To assess the accuracy of a suite of state-of-the-art biotracer-based computational methods, namely CHEMTAX, FASTAR (Fatty Acid Source-Tracking Algorithm in R), MixSIAR, and QFASA (Quantitative Fatty Acid Signature Analysis), lake water samples were collected in 2016, 2018, 2019, 2020, and 2021 for seston composition analysis in a boreal eutrophic lake, with light-microscopy counting serving as the reference for seston composition. Absolute errors between the biotracer-based estimates and the light-microscopy reference were calculated to evaluate method performance. A small laboratory experiment was also conducted to assess how reliably the computational methods estimate the contribution of non-algal particles from fatty acids. The closest alignment to light-microscopy counting in terms of absolute error was achieved when both carotenoids and fatty acids were utilised together in the QFASA method. For CHEMTAX, FASTAR, and MixSIAR, utilising carotenoids alone produced the closest results.
Additionally, the estimation methods accurately assessed the proportion of non-algal particles in the seston when utilising fatty acid profiles, which is not possible with light-microscopy counting. Our findings demonstrate that the biotracer approach provides a viable and cost-effective alternative to light-microscopy counting when group-level information on phytoplankton community composition suffices. Furthermore, we show that non-algal particles can be effectively estimated together with phytoplankton when utilising fatty acids.
https://doi.org/10.5061/dryad.t1g1jwt9v
The study contained two separate experiments.
The first experiment (named here ‘Main_experiment’) consisted of seston field samples acquired from Lake Vesijärvi, Finland. The biomolecules of the samples, namely pigments (hereafter referred to as carotenoids) and fatty acids, were analysed, and the biomolecule data were used to estimate the composition of the samples. The estimates were produced computationally in R with CHEMTAX, FASTAR, MixSIAR and QFASA, utilising a published biomolecule profile source library supplemented with original additions of non-algal particle biomolecule profiles. Five versions of the source library were tested: three with only phytoplankton (carotenoids only, fatty acids only, and carotenoids and fatty acids together) and two with phytoplankton and non-algal particles (fatty acids only, and carotenoids and fatty acids together). The phytoplankton community composition estimates were compared to light-microscopy counting results derived from an open-source phytoplankton database of the Finnish Environment Institute (Hertta), taken from either the same sampling day or a day sufficiently close to it.
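The comparison against the light-microscopy reference can be illustrated with a minimal Python sketch of a per-group absolute error. The group names and proportions below are invented for illustration, treating a group that is missing on one side as 0 is an assumption, and averaging across groups is only one possible summary, not necessarily the one used in the manuscript.

```python
def absolute_errors(estimate, reference):
    """Per-group |estimate - reference|; a group missing on one side counts as 0."""
    groups = sorted(set(estimate) | set(reference))
    return {g: abs(estimate.get(g, 0.0) - reference.get(g, 0.0)) for g in groups}


def mean_absolute_error(estimate, reference):
    """Average of the per-group absolute errors (one possible summary statistic)."""
    errors = absolute_errors(estimate, reference)
    if not errors:
        raise ValueError("no groups to compare")
    return sum(errors.values()) / len(errors)


# Hypothetical proportional compositions of one sample (each sums to 1).
estimate = {"Cyanobacteria": 0.40, "Diatoms": 0.35, "Cryptophytes": 0.25}
reference = {"Cyanobacteria": 0.50, "Diatoms": 0.30, "Cryptophytes": 0.20}

errors = absolute_errors(estimate, reference)
mae = mean_absolute_error(estimate, reference)

for group, err in errors.items():
    print(f"{group}: {err:.2f}")
print(f"mean absolute error: {mae:.3f}")
```

Keeping the per-group errors separate from the summary makes it easy to see which groups a given method and source library version estimate poorly.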
The second experiment (named here ‘POC_experiment’) was a small laboratory experiment in which a gradient of mixtures of alder leaf extract incubated in natural lake water and the green alga Chlamydomonas reinhardtii was produced to assess the ability of all four aforementioned estimation methods to distinguish between non-algal particles and algae. This test utilised the version of the source library that contains fatty acids only, with non-algal particles included.
Description of the data and file structure
The folder ‘Main_experiment’ includes three folders and two R-scripts:
- ‘Data’ folder: contains the field sample biomolecule profiles (‘field_samples.csv’), the proportional light-microscopy phytoplankton community compositions (‘microscopy.csv’) used as the reference for evaluating the accuracy of the computational methods, and the source library used by the computational methods (‘source_library.csv’). The file ‘microscopy.csv’ also gives the origin of the data and the depth and date of sampling. The file ‘source_library.csv’ gives the group name used, the origin of the profile, and the phytoplankton class and species for each profile. Because the classification of plants differs from that of phytoplankton, class information is not given for the Reed, tPOM and tPOMb groups (marked ‘NA’). The file ‘source_library.csv’ was also used to produce the PCA presented in the manuscript.
- ‘Estimates’ folder: contains ‘estimates.csv’, which gathers the raw estimation samples into one file. These data were used with the aforementioned microscopy data to calculate the results and construct the figures and tables in the manuscript.
- ‘Results’ folder: contains the raw results from each individual estimation run. The subfolders are named by estimation method, and the result files within them (e.g., ‘160706A_1_2.csv’) are named by a code consisting of the sample date, the biomolecules used, and whether non-algal particles were included.
- ‘estimation.R’: includes the estimation code for all aforementioned estimation methods and the framework that goes through all combinations of method and source library. It saves the estimates in the ‘Results’ folder. The code requires JAGS to be installed (for FASTAR and MixSIAR).
- ‘resultCombiner.R’: combines the raw results from the ‘Results’ folder into the single file ‘Estimates/estimates.csv’.
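The set of runs that ‘estimation.R’ is described as covering can be sketched as a simple grid. The Python sketch below is an illustration rather than a copy of the R code: it assumes that every method is paired with every one of the five source library versions, and the dictionary keys are hypothetical names chosen here.

```python
from itertools import product

METHODS = ["CHEMTAX", "FASTAR", "MixSIAR", "QFASA"]

# The five source library versions: (biomolecules used, non-algal particles included?)
LIBRARY_VERSIONS = [
    ("carotenoids", False),
    ("fatty acids", False),
    ("carotenoids + fatty acids", False),
    ("fatty acids", True),
    ("carotenoids + fatty acids", True),
]

# One entry per (method, library version) pair, assuming the grid is complete.
runs = [
    {"method": method, "biomolecules": biomolecules, "non_algal_particles": nap}
    for method, (biomolecules, nap) in product(METHODS, LIBRARY_VERSIONS)
]

# 4 methods x 5 library versions = 20 runs per sample
print(len(runs))
for run in runs[:5]:
    print(run)
```

Under that assumption, each field sample yields 20 result files, one per method and library version.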
The folder ‘POC_experiment’ follows a similar structure. In the ‘Data’ folder, the laboratory experiment fatty acid profiles for each experimental ratio (green algae : tPOM) are in the file ‘lab_samples_POC.csv’, and the source library with only fatty acid profiles is in the file ‘source_library_POC.csv’. The ‘Estimates’ folder contains ‘POC_estimates.csv’, which was used to create the results, figures, and tables related to the second experiment. The ‘Results’ folder contains folders named after the computational methods, and the result files (e.g., ‘AP1090.csv’) are named by a code that corresponds to the ratio of green algae to non-algal particles in the samples. The R-scripts ‘estimation_POC.R’ and ‘resultCombiner_POC.R’ are slightly modified versions of ‘estimation.R’ and ‘resultCombiner.R’ from the main experiment, adapted to the needs of this experiment.
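The step from per-run files in ‘Results’ to a single file in ‘Estimates’ is the same in both experiments and can be sketched as follows. This Python sketch only assumes the layout described above (one folder per method, one CSV per run); the column names ‘group’ and ‘proportion’, the added ‘method’ and ‘run_code’ columns, and the demo values are hypothetical, and the actual R-scripts may organise the combined file differently.

```python
import csv
import tempfile
from pathlib import Path


def combine_results(results_dir, out_file):
    """Stack every <method>/<run code>.csv under results_dir into one CSV file."""
    rows = []
    for csv_path in sorted(Path(results_dir).glob("*/*.csv")):
        with csv_path.open(newline="") as handle:
            for record in csv.DictReader(handle):
                # Tag each row with its origin: folder name = method, file stem = run code.
                rows.append({"method": csv_path.parent.name,
                             "run_code": csv_path.stem, **record})
    # Union of column names in first-seen order, so files may differ in columns.
    fieldnames = []
    for row in rows:
        for key in row:
            if key not in fieldnames:
                fieldnames.append(key)
    out_path = Path(out_file)
    out_path.parent.mkdir(parents=True, exist_ok=True)
    with out_path.open("w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=fieldnames, restval="")
        writer.writeheader()
        writer.writerows(rows)
    return len(rows)


# Demo on a throwaway directory tree with made-up contents.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for method, first, second in [("CHEMTAX", "0.6", "0.4"), ("QFASA", "0.7", "0.3")]:
        folder = root / "Results" / method
        folder.mkdir(parents=True)
        (folder / "160706A_1_2.csv").write_text(
            f"group,proportion\nGroupA,{first}\nGroupB,{second}\n"
        )
    n_rows = combine_results(root / "Results", root / "Estimates" / "estimates.csv")
    with (root / "Estimates" / "estimates.csv").open(newline="") as handle:
        combined = list(csv.DictReader(handle))

print(n_rows)
print(combined[0])
```

Keeping the file stem as an unparsed ‘run_code’ lets the same function serve both naming schemes (‘160706A_1_2’ and ‘AP1090’) without guessing how the codes are encoded.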
Sharing/Access information
Data were derived from the following sources:
- Biomolecule source library (Peltomaa et al. 2023, Appendix A, Multimedia component 3): https://doi.org/10.1016/j.phytochem.2023.113624
- Phytoplankton database of the Finnish Environment Institute: https://www.avoindata.fi/data/en_GB/dataset/kasviplanktontietojarjestelma-kplank
Code/Software
Version 4.3.1 of R and version 4.3.1 of JAGS were used. In the R-scripts ‘estimation.R’ and ‘estimation_POC.R’, the following R-packages were used: limSolve 1.5.7, MASS 7.3-60, matrixStats 1.2.0, MixSIAR 3.1.12, qfasar 1.2.1, and R2jags 0.7-1. Notably, the function ‘est_diet()’ of qfasar was modified to return the bootstrap sample and renamed ‘diet_est_boot()’.