Dryad

The CALFISH database: A century of California's non-confidential fisheries landings and participation data

Cite this dataset

Free, Christopher M. et al. (2022). The CALFISH database: A century of California's non-confidential fisheries landings and participation data [Dataset]. Dryad. https://doi.org/10.25349/D9M907

Abstract

California's commercial and recreational fisheries support vibrant coastal economies and communities. Maintaining healthy fishing communities into the future requires a detailed understanding of their past. The California Department of Fish and Wildlife (CDFW) has been monitoring statewide fisheries landings and participation since 1916 and releases confidential versions of these data through authorized data requests and non-confidential summaries of these data in its quasi-annual landings reports. The non-confidential data published in the landings reports provide a rich history of California's fisheries but are scattered across 1000s of tables in 100s of documents, limiting their accessibility to researchers, fishers, and other interested stakeholders. We reviewed the 58 landings reports published from 1929 to 2020 and extracted and carefully curated 13 datasets with long time series and wide public interest. These datasets include: (1) annual landings in pounds and value by port and species from 1941 to 2019; (2) annual number of commercial fishing vessels by length class from 1934 to 2020; (3) annual number of licensed commercial fishers by area of residence from 1916 to 2020; and (4) annual number of party boat (CPFV) vessels, anglers, and their total catch by species from 1936 to 2020. Notably, we harmonized port names, species common names, and species scientific names across all years and datasets. We make these curated datasets, collectively called the CALFISH database, publicly available to any interested stakeholder in the supplementary materials of this paper, on an open-access data repository, and in the wcfish R package. These datasets can be used (1) to understand the historical context of California's fisheries; (2) for original research requiring only summaries of historical landings and participation data; and (3) to anticipate the likely characteristics of confidential data requested from the state. We conclude the paper by identifying key principles for increasing the accessibility and utility of historical fisheries landings and participation data.

Methods

The California Department of Fish and Wildlife (CDFW) has been monitoring statewide fisheries landings and participation since 1916 and releases confidential versions of these data through authorized data requests and non-confidential summaries of these data in its quasi-annual landings reports. The non-confidential data published in the landings reports provide a rich history of California’s fisheries but are scattered across 1000s of tables in 100s of documents, limiting their accessibility to researchers, fishers, and other interested stakeholders. We reviewed the 58 landings series reports published by CDFW from 1928 to 2020 and extracted and curated 13 datasets with long time series and wide public interest. In general, these datasets describe landings and participation in commercial fishing and the CPFV sector of recreational fishing (i.e., recreational fishing from private boats and from shore is not described in these reports). We rigorously quality-controlled all of the extracted data and enhanced the datasets with additional attributes of interest where possible. Notably, these enhancements included harmonizing common names across years and datasets and linking common names with scientific names.

Usage notes

The landings datasets curated below describe landings in terms of both volume (pounds) and value (dollars). The values reflect nominal ex-vessel values and have not been adjusted for inflation. The volumes are reported “without regard to condition” and reflect the volumes reported on the original landings receipt (i.e., they have not been universally converted to round weights). Although most fish and shellfish are landed in round (whole) condition, some species may be eviscerated (gutted), dressed, or beheaded before being brought ashore, but this is not recorded in the data. This is especially common for barracuda, shark, salmon, sablefish, white seabass, and swordfish. A few market categories do include descriptions of condition (i.e., Pacific herring roe, Pacific herring roe on kelp, Chinook/coho salmon roe, spider/sheep crab claws, and crab claws), but there is no guidance on how to interpret these descriptions. We provide an attribute for condition with four options (roe, roe on kelp, claws, and not specified) but caution against using this attribute without further clarification from the state.
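Because the values are nominal, comparing dollar amounts across decades generally requires deflating them with a price index first. The following is a minimal Python sketch of that adjustment and is not part of the CALFISH database; the function name, the record fields (`year`, `value_usd`), and the index values are hypothetical placeholders, and a real analysis would substitute a published index such as a consumer price index or GDP deflator.

```python
def to_real_dollars(nominal_value, year, price_index, base_year):
    """Express a nominal dollar value in base-year dollars.

    price_index maps year -> index level; only the ratio between the
    base year and the landing year matters, not the index's scale.
    """
    if year not in price_index:
        raise KeyError(f"no price index value for year {year}")
    if base_year not in price_index:
        raise KeyError(f"no price index value for base year {base_year}")
    return nominal_value * price_index[base_year] / price_index[year]


# Placeholder index levels for illustration only (NOT real statistics).
example_index = {1950: 25.0, 2019: 250.0}

# Hypothetical landings records with nominal ex-vessel values.
landings = [
    {"year": 1950, "value_usd": 1000.0},
    {"year": 2019, "value_usd": 1000.0},
]

for record in landings:
    record["value_usd_2019"] = to_real_dollars(
        record["value_usd"], record["year"], example_index, base_year=2019
    )
```

Keeping the nominal value alongside the adjusted one, as done here, lets a user switch to a different index or base year later without returning to the source tables.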

The CDFW datasets report landings by market categories that are not always species specific. Furthermore, these market categories are described using common names rather than scientific names. Although a key for relating common and scientific names is provided at the beginning of each Fish Bulletin-hosted landings report, the conventions for common names and their alignment with scientific names vary throughout the landings series. We rigorously harmonized common names across years and datasets and associated common names with updated scientific names with guidance from the Fish Bulletin species keys. To ease analysis, maintain transparency, and allow users to make different decisions regarding species identities, every dataset with species-specific information includes the original common name, the harmonized common name, and the updated scientific name. We also provide a key for appending additional taxonomic information (i.e., phylogenetic groups and/or commercial categories) to any of the curated datasets. Overall, the landings data include 397 market categories representing 12 phyla, 25 classes, 68 orders, 130 families, and 200 genera.
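Appending taxonomic information from such a key amounts to a lookup on the scientific name. The sketch below illustrates the idea in plain Python; the column names (`comm_name_orig`, `comm_name`, `sci_name`, `landings_lb`), the landings figures, and the mixed market category are assumptions made for illustration and may not match the actual files.

```python
# Hypothetical landings records carrying the three name attributes.
landings = [
    {"comm_name_orig": "Sablefish", "comm_name": "Sablefish",
     "sci_name": "Anoplopoma fimbria", "landings_lb": 100.0},
    {"comm_name_orig": "Swordfish", "comm_name": "Swordfish",
     "sci_name": "Xiphias gladius", "landings_lb": 50.0},
    # A made-up market category that is not species specific.
    {"comm_name_orig": "Mixed category", "comm_name": "Mixed category",
     "sci_name": None, "landings_lb": 25.0},
]

# Illustrative taxonomic key indexed by scientific name.
taxa_key = {
    "Anoplopoma fimbria": {"family": "Anoplopomatidae"},
    "Xiphias gladius": {"family": "Xiphiidae"},
}


def append_taxonomy(records, key, fields=("family",)):
    """Return copies of records with taxonomic fields appended.

    Records whose scientific name is absent from the key get None for
    each field, so unmatched market categories stay visible.
    """
    enriched = []
    for record in records:
        taxon = key.get(record["sci_name"], {})
        extra = {field: taxon.get(field) for field in fields}
        enriched.append({**record, **extra})
    return enriched


enriched = append_taxonomy(landings, taxa_key)
```

Leaving unmatched categories in place with empty taxonomic fields, rather than dropping them, keeps totals intact and makes it easy to see which market categories still need a decision.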

Finally, many of the datasets published in the landings series report statistics for individual fishing ports or for groups of fishing ports called “port complexes”. However, the naming conventions for ports and the delineation of port complexes vary throughout the landings series. To ease analysis, we harmonized port and port complex attributes across years and datasets. In most cases, harmonizing port names involved straightforward decisions (e.g., “Bay”, “Bay (Bodega)”, and “Bodega Bay” all refer to Bodega Bay). However, in some cases, nuanced decisions were required. Namely, we decided that references to “Tomales Bay (Marshall)”, “Princeton (Half Moon Bay)”, and “Point Reyes (Drakes Bay)” imply “Tomales Bay & Marshall”, “Princeton & Half Moon Bay”, and “Point Reyes & Drakes Bay”. This decision was based on the fact that, in some years, statistics are reported separately for these commonly paired ports. We used slashes to denote grouped ports (e.g., “Tomales Bay/Marshall” indicates both Tomales Bay and Marshall together) in the harmonized port names. We retained the original port name in the curated datasets to make our decisions transparent and to allow users to make different decisions. The geographical delineation of port complexes varied throughout the landings series (Figure S1) with: 13 complexes defined by county lines in Fish Bulletins (FB) 15-44 (1926-1930), 8 complexes defined by natural landmarks in FB 44-49 (1931-1935), 7 complexes defined by county lines in FB 57-173 (1936-1986), and 9 complexes defined by county lines in FB 181 and the website-hosted reports (1987-2019). We used the recent 9-complex typology in the curated datasets but provide a key to summarize data based on the older typologies. This key also includes the coordinates (lat/long) of each port.
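The port-name harmonization described above can be pictured as a lookup table from original to harmonized names, applied while keeping the original name in each record. The Python sketch below uses only the examples given in this section; the field names (`port_orig`, `port`, `landings_lb`) and the landings figures are hypothetical, and the actual curation may have been carried out differently.

```python
from collections import defaultdict

# Original name -> harmonized name, limited to the examples in the text.
PORT_LOOKUP = {
    "Bay": "Bodega Bay",
    "Bay (Bodega)": "Bodega Bay",
    "Bodega Bay": "Bodega Bay",
    "Tomales Bay (Marshall)": "Tomales Bay/Marshall",
    "Princeton (Half Moon Bay)": "Princeton/Half Moon Bay",
    "Point Reyes (Drakes Bay)": "Point Reyes/Drakes Bay",
}


def harmonize_port(port_orig, lookup=PORT_LOOKUP):
    """Map an original port name to its harmonized form.

    Names missing from the lookup pass through unchanged (after
    trimming whitespace) so that no record is silently dropped.
    """
    name = port_orig.strip()
    return lookup.get(name, name)


def total_by_port(records):
    """Sum landings by harmonized port, leaving the records unmodified."""
    totals = defaultdict(float)
    for record in records:
        totals[harmonize_port(record["port_orig"])] += record["landings_lb"]
    return dict(totals)


# Hypothetical records; the pound values are placeholders.
records = [
    {"port_orig": "Bay", "landings_lb": 10.0},
    {"port_orig": "Bodega Bay", "landings_lb": 5.0},
    {"port_orig": "Tomales Bay (Marshall)", "landings_lb": 2.0},
]

totals = total_by_port(records)
```

Because the original name is never overwritten, a user who disagrees with a grouping (for example, treating Tomales Bay and Marshall separately in years where they are reported separately) can supply a different lookup table and recompute.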

Funding

The Nature Conservancy