Data from: Historical biogeography supports Point Conception as the site of turnover between temperate East Pacific ichthyofaunas

Published Sep 14, 2023 on Dryad. https://doi.org/10.5061/dryad.2ngf1vhvk

Data files

Sep 14, 2023 version files (299.85 MB):

CA_fishes_dryad_package_Aug_24_2023.zip (299.85 MB)
README.md (5.48 KB)

Abstract

The cold temperate and subtropical marine faunas of the Northeastern Pacific meet within California as part of one of the few eastern boundary upwelling ecosystems in the world. Traditionally, it is believed that Point Conception is the precise site of turnover between these two faunas due to sharp changes in oceanographic conditions. However, evidence from intraspecific phylogeography and species range termini does not support this view, finding stronger biogeographic breaks elsewhere along the coast. Here I develop a new application of historical biogeographic approaches to uncover sites of transition between faunas without needing an a priori hypothesis of where these occur. I used this approach to determine whether the point of transition between northern and southern temperate faunas occurs at Point Conception or elsewhere within California. I also examined expert-vetted latitudinal range data of California fish species from the 1970s and the 2020s to assess how biogeography could change against the backdrop of climate change. The site of turnover was found to occur near Point Conception, in concordance with the traditional view. I suggest that recent species- and population-level processes could be expected to give signals of different events from historical biogeography, possibly explaining the discrepancy across studies. Species richness of California has increased since the 1970s, mostly due to species' ranges expanding northward from Baja California (Mexico). Range shifts under warming conditions seem to be increasing the disparity between northern and southern faunas of California, creating a more divergent biogeography.

Historical biogeography supports Point Conception as the site of turnover between temperate East Pacific ichthyofaunas
Elizabeth Christina Miller
Contact: lizmiller2633@gmail.com
Prepared September 13, 2023

README file

The R project "CA_fishes.Rproj" lets all R scripts run within a common directory. Open this project first, before running any scripts.

The folder "biogeobears_model_fitting" contains all R scripts, input files, and outputs resulting from BioGeoBEARS analyses of RAY-FINNED FISHES

R script "fit_biogeobears_models_CAfishes.R" fits the models. 
Inputs of this script are:
	rabosky_forbgb.tree (phylogeny)
	ca_fishes_phylip.txt (range codes for species, where A=marine but outside Northeast Pacific, B=within Northeast Pacific but north of California, C=within California, D=within Northeast Pacific but south of California, E=freshwater, diadromous or brackish)
	ca_fishes_dispmat (dispersal matrix)
R script "run_stochastic_mapping.R" should be run second; it performs stochastic mapping on the best-fit model
Objects with suffix ".Rdata" are outputs of the two scripts
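
To make the range coding concrete, here is a minimal sketch that decodes a PHYLIP-style geography file of the kind BioGeoBEARS reads. It assumes the usual layout: a header line with the number of taxa, the number of areas, and the area letters in parentheses, followed by one line per species with a 0/1 string. The sample text and species names are invented for illustration and are not taken from "ca_fishes_phylip.txt". The sketch is in Python for brevity, whereas the scripts in this package are in R.

```python
import re

def parse_geography(text):
    """Return {species: set of area letters} from PHYLIP-style geography text."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    # Assumed header layout: "<n_taxa> <n_areas> (A B C D E)"
    match = re.match(r"(\d+)\s+(\d+)\s+\(([^)]*)\)", lines[0])
    if match is None:
        raise ValueError("unrecognized header: " + lines[0])
    n_taxa, n_areas = int(match.group(1)), int(match.group(2))
    areas = match.group(3).split()
    if len(areas) != n_areas:
        raise ValueError("header area count does not match area letters")
    ranges = {}
    for ln in lines[1:]:
        name, bits = ln.split()
        if len(bits) != n_areas:
            raise ValueError("wrong number of areas for " + name)
        # Keep the letter of every area marked with a 1
        ranges[name] = {area for area, bit in zip(areas, bits) if bit == "1"}
    if len(ranges) != n_taxa:
        raise ValueError("header taxon count does not match data lines")
    return ranges

# Hypothetical example data (not from the dataset)
SAMPLE = """3 5 (A B C D E)
species_one\t01100
species_two\t00110
species_three\t00100
"""

ranges = parse_geography(SAMPLE)
# Species whose range includes California (area C)
in_california = sorted(sp for sp, r in ranges.items() if "C" in r)
```

In this toy input, "species_one" spans areas B and C, meaning it occurs in California and also north of it.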

The folder "biogeobears_model_fitting_chondr" contains all R scripts, input files, and outputs resulting from BioGeoBEARS analyses of CARTILAGINOUS FISHES

R script "fit_biogeobears_models_CAfishes_chondr.R" fits the models. 
Inputs of this script are:
	chondr_tree_bgb.tree (phylogeny)
	chondr_phylip.txt (range codes for species, letters same as above)
	ca_fishes_dispmat (dispersal matrix)
R script "run_stochastic_mapping_chondr.R" should be run second; it performs stochastic mapping on the best-fit model
Objects with suffix ".Rdata" are outputs of the two scripts

After fitting BioGeoBEARS models and running stochastic mapping analyses, proceed to the folders with the prefix "process_biogeobears".

The folder "process_biogeobears" contains scripts and data files relevant to RAY-FINNED FISHES. The folder "process_biogeobears_chondr" does the same for CARTILAGINOUS FISHES. The scripts in both folders run the same way, so the descriptions below apply to both.

Script "1_get_all_states.R" is a preparation script that extracts all states at all nodes inferred by biogeographic stochastic mapping for each of 100 stochastic maps. At the moment, this inelegantly outputs 100 csv files with the reconstructed states

The folder "all_states" holds the output csv files from Script 1. Each csv file is a single stochastic map. These are inputs for the next step.

Script "2_get_individual_cols.R" contains a custom function that identifies individual colonization events to a region of choice (in this case, colonizations of California waters, which were region "C" in my BioGeoBEARS analyses). The script reads in each of the 100 csv files produced by Script 1.

Running the custom function will automatically output two csv files: "ind_cols_C_sourceregion_bymap.csv", which contains the source region of all lineages colonizing California, and "ind_cols_C_time_bymap.csv", which contains the timing of each colonization event. I only used the former in this study, but the latter might be useful to some folks with different use cases.

These csv files are structured as follows: each row is a tip in the tree corresponding to a species in the focal region (in my case, California), and each column is one stochastic map (V1–V100), for 100 columns in total.
The letter in each cell is the source region: the script walks backwards from that species to the moment when the lineage colonized California waters, and extracts the source of that colonization from the stochastic map.
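
The walk-back idea can be sketched as follows. This is a simplified illustration, not the function in "2_get_individual_cols.R": it assumes each stochastic map has been reduced to one state per node (as in the csv files from Script 1) and ignores changes along branches. The tree, node names, and states are hypothetical, and the sketch is in Python while the package scripts are in R.

```python
def colonization_source(tip, parent, state, focal="C"):
    """Walk from a tip toward the root and return the state of the first
    ancestor whose range lacks the focal region. Returns None when every
    ancestor up to the root already occupies the focal region."""
    if focal not in state[tip]:
        raise ValueError(tip + " does not occur in region " + focal)
    node = tip
    while node in parent:  # the root has no entry in `parent`
        ancestor = parent[node]
        if focal not in state[ancestor]:
            # The lineage entered the focal region below this ancestor
            return state[ancestor]
        node = ancestor
    return None

# Hypothetical tree: child -> parent
parent = {
    "tip1": "n2", "n2": "n1", "tip2": "n1",
    "n1": "root", "tip3": "n3", "n3": "root",
}
# Hypothetical states for one stochastic map: node -> range letters
state = {
    "root": "A", "n1": "B", "n2": "BC", "tip1": "C",
    "tip2": "B", "n3": "D", "tip3": "CD",
}

source_tip1 = colonization_source("tip1", parent, state)
source_tip3 = colonization_source("tip3", parent, state)
```

Here "tip1" traces back through a widespread ancestor ("BC") to a northern ancestor ("B"), so its source is "B", while "tip3" descends directly from a southern ancestor ("D"). Repeating this for every California tip in every map would give one row per species and one column per map.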

The relevant value can be summarized across the stochastic maps for each species in a manner of the researcher's choosing.

Script "3_tally_colonizations.R" creates the output file "ind_cols_C_sourceregion_bymap_withtallies.csv", which simply appends columns to the end of the csv counting the number of stochastic maps with a northern vs southern origin for the species (i.e. counts the number of "B" versus "D" as the source region).
As described in the manuscript, I used >70 of the 100 stochastic maps as the cut-off to code the region of origin for each species.
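
The tally and cut-off can be sketched as below. This is an illustration of the logic only, in Python instead of the R used in "3_tally_colonizations.R". The function name, the labels "north", "south", and "unresolved", and the example rows are assumptions made for the sketch.

```python
from collections import Counter

def code_origin(sources, cutoff=70):
    """Count northern ("B") and southern ("D") source regions across the
    stochastic maps for one species and apply the >cutoff rule."""
    counts = Counter(sources)
    n_north, n_south = counts["B"], counts["D"]
    if n_north > cutoff:
        origin = "north"
    elif n_south > cutoff:
        origin = "south"
    else:
        origin = "unresolved"  # label chosen for this sketch
    return n_north, n_south, origin

# Hypothetical rows: one source letter per stochastic map (100 maps each)
clear_north = code_origin(["B"] * 85 + ["D"] * 15)
borderline = code_origin(["B"] * 70 + ["D"] * 30)
clear_south = code_origin(["D"] * 90 + ["A"] * 10)
```

Because the cut-off is strict, a species with exactly 70 northern maps is left uncoded in this sketch.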

The folder "plotting_data" contains all inputs and R scripts required to create the figures in the manuscript
The script "4_plot_data.R" produces all the plots.
Input files for plotting can all be found in the folder "plotting_files"
Note that I made all of the input files for this script by hand, in order to get them in a ggplot-friendly format and to combine results for ray-finned and cartilaginous fishes. I'm sure there's a way to automate this, but I found it easier to do by hand.

	Figure 2 panels are made from the input files in the folder "species_richness". They contain the count of species found in each latitudinal band by region-of-origin, which was determined in script 3 above (separated by benthic/pelagic and ray-finned versus cartilaginous fishes)
	Figure 3 panels are made from the input files in the folder "delta". This is the difference between the number of species with a northern origin and the number with a southern origin, for every latitudinal band
	Figure 4 panels are made from the input files in the folders "S_range_termini" and "N_range_termini". This is the count of species whose range ends at each latitudinal band, by region-of-origin
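
The three quantities behind these figures can be sketched as follows. Since the actual input files were assembled by hand, this is only an illustration of the arithmetic, written in Python. It assumes one-degree latitudinal bands, a species occupying every band between its range limits, and delta taken as northern minus southern. All species and latitudes are invented.

```python
def richness_by_band(species, bands):
    """Count species per band and origin; a species is counted in every
    band between its southern and northern range limits (inclusive)."""
    counts = {band: {"north": 0, "south": 0} for band in bands}
    for origin, south_limit, north_limit in species:
        for band in bands:
            if south_limit <= band <= north_limit:
                counts[band][origin] += 1
    return counts

def delta_by_band(counts):
    """Northern-origin count minus southern-origin count for each band."""
    return {band: c["north"] - c["south"] for band, c in counts.items()}

def northern_termini(species, bands):
    """Count species whose range ends (northern limit) in each band."""
    counts = {band: {"north": 0, "south": 0} for band in bands}
    for origin, _south_limit, north_limit in species:
        if north_limit in counts:
            counts[north_limit][origin] += 1
    return counts

# Hypothetical species: (origin, southern limit, northern limit) in degrees N
species = [
    ("north", 34, 42), ("north", 36, 42),
    ("south", 32, 34), ("south", 32, 35), ("south", 32, 33),
]
bands = list(range(32, 43))

richness = richness_by_band(species, bands)
delta = delta_by_band(richness)
n_termini = northern_termini(species, bands)
```

In this toy example delta changes sign between bands 34 and 36, which is the kind of pattern that marks a turnover between faunas.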

The folder "results" is just an output folder to export the plots made by script 4.
