Species richness in North Atlantic fish: process concealed by pattern

Cite this dataset

Gislason, Henrik et al. (2021). Species richness in North Atlantic fish: process concealed by pattern [Dataset]. Dryad.


Aim Previous analyses of marine fish species richness based on presence-absence data have shown changes with latitude and average species size, but little is known about the underlying processes. To elucidate these processes we use metabolic, neutral, and descriptive statistical models to analyse how richness responds to maximum species length, fish abundance, temperature, primary production, depth, latitude, and longitude, while accounting for differences in species catchability, sampling effort, and mesh size.

Data Results from 53,382 bottom trawl hauls representing 50 fish assemblages.

Location The northern Atlantic from Nova Scotia to Guinea, the Mediterranean, and the Arctic Sea.

Time period 1977-2013

Methods A descriptive Generalised Additive Model was used to identify functional relationships between species richness and potential drivers, after which non-linear estimation techniques were used to parameterize: 1) a ‘best’ fitting model of species richness built on the functional relationships, 2) an environmental model based on latitude, longitude and depth, and mechanistic models based on 3) metabolic and 4) neutral theory.

Results In the ‘best’ model the number of species observed is a lognormal function of maximum species length. It increases significantly with temperature, primary production, sampling effort, and abundance, and declines with depth and, for small species, with the mesh size in the trawl. The ‘best’ model explains close to 90% of the deviance and the neutral, metabolic, and environmental models 89%. In all four models, maximum species length and either temperature or latitude account for more than half of the deviance explained.

Main conclusion The two mechanistic models explain the patterns in demersal fish species richness in the northern Atlantic almost equally well. A better understanding of the underlying drivers is likely to require development of dynamic mechanistic models of richness and size evolution, fit not only to extant distributions, but also to historical environmental conditions and to past speciation and extinction rates.


Survey data

Average catch in number of individuals per species and haul was provided from 31 scientific bottom trawl surveys. The earliest trawl hauls were taken in 1977 and the most recent in 2013. Different bottom trawls were used in the surveys. Cod-end mesh sizes ranged from 13 to 40 mm, horizontal trawl openings (wing spread) from 13 to 28 m, vertical openings from 1.9 to 7 m, and towing speeds from 3 to 4.5 knots. Many of the surveys used a stratified random sampling design to account for spatial and depth related differences in species composition. The major stratification used in the surveys was kept, providing richness and density data from 50 different strata. Average depth was calculated as the midpoint of the depth range of each stratum and ranged from 28 to 950 m. Latitude and longitude were calculated as the average of the minimum and maximum coordinates of each survey.
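The midpoint calculations described above can be sketched as follows. This is a minimal illustration, not the authors' code; the function names and the example values are hypothetical.

```python
def stratum_depth(depth_min_m, depth_max_m):
    """Average depth of a stratum, taken as the midpoint of its depth range."""
    return (depth_min_m + depth_max_m) / 2.0


def survey_centre(lat_min, lat_max, lon_min, lon_max):
    """Survey position as the average of the minimum and maximum coordinates.

    Assumes longitudes do not cross the 180 degree meridian, which holds for
    the northern Atlantic, the Mediterranean and the adjacent Arctic.
    """
    return ((lat_min + lat_max) / 2.0, (lon_min + lon_max) / 2.0)


# Hypothetical values, invented for illustration only.
depth_m = stratum_depth(100.0, 200.0)
lat_deg, lon_deg = survey_centre(54.0, 58.0, -4.0, 8.0)
```

A midpoint of the coordinate extremes is not an area-weighted centroid, so for irregularly shaped survey areas the two can differ.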

Environmental data

Sea surface temperature, average temperature in the upper 200 m of the water column, and near bottom temperatures (Kelvin) were obtained from the World Ocean Atlas 2013 based on decadal average temperature at 0.25° resolution covering the period 1955-2012 for annual, boreal summer (Jul-Sep) and boreal winter (Jan-Mar). Bottom temperatures were defined as the temperature in the layer closest to the bottom. Spatial averages were calculated for each survey stratum, and the seasonal amplitude was calculated as the difference between summer and winter values. Estimates of depth integrated pelagic net primary production (npp, g C m⁻² y⁻¹) based on the satellite-derived Vertically Generalised Production Model (VGPM) were downloaded at 1/12 degree monthly resolution for the period 2002-2012, from which estimates of mean annual npp were derived for each survey area.
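The spatial averaging and the seasonal amplitude can be sketched as below. The sketch assumes an unweighted mean over the grid cells falling inside a stratum; whether the original analysis weighted cells by area is not stated here. Function names and temperatures are hypothetical.

```python
from statistics import mean


def stratum_mean(cell_values):
    """Unweighted mean over the grid cells of one stratum (assumption: no area weighting)."""
    return mean(cell_values)


def seasonal_amplitude(summer_cells_k, winter_cells_k):
    """Seasonal amplitude as stratum-mean summer minus stratum-mean winter temperature."""
    return stratum_mean(summer_cells_k) - stratum_mean(winter_cells_k)


# Hypothetical grid-cell temperatures in Kelvin for one stratum.
summer_k = [288.0, 289.0, 290.0]
winter_k = [280.0, 281.0, 282.0]
amplitude_k = seasonal_amplitude(summer_k, winter_k)
```

Because the amplitude is a difference, it has the same numerical value in Kelvin and in degrees Celsius.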

Fish species data

Only species that are likely to be regularly retained by the survey gear when available (species resting on the seabed, species found close to but not on the seabed, and midwater species with some bottom contact) were included in the dataset. Among the fish taxa recorded some individuals had not been identified to species. If possible, these individuals were allocated to species, assuming that their relative species composition would be identical to that of the individuals identified within the same survey stratum, and family or genus. Where no species from the family or genus had been identified in a stratum, the family or genus name was retained. Information about the maximum length of each species was downloaded from FishBase and used to bin the observations into 11 log maximum length intervals of equal width. In 1% of the species records no maximum species length was available. These records were excluded from the data.
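Two steps in this paragraph lend themselves to a short sketch: the proportional allocation of unidentified individuals and the binning into 11 equal-width log length intervals. The code below is an illustration under stated assumptions. The logarithm base (10), the bin range, the species names and all function names are hypothetical and are not taken from the dataset.

```python
import math


def allocate_unidentified(identified_counts, unidentified_total):
    """Distribute unidentified individuals over species in proportion to the
    identified counts of the same family or genus within a stratum.

    Returns None when no species was identified, in which case the family or
    genus name would be retained instead.
    """
    total = sum(identified_counts.values())
    if total == 0:
        return None
    return {
        species: count + unidentified_total * count / total
        for species, count in identified_counts.items()
    }


def log_length_bin(max_length_cm, log_min, log_max, n_bins=11):
    """Index (0-based) of the equal-width log10 length interval for a species.

    log_min and log_max are the assumed lower and upper edges of the binned
    range on the log10 scale; values outside the range go to the end bins.
    """
    width = (log_max - log_min) / n_bins
    index = int((math.log10(max_length_cm) - log_min) // width)
    return min(max(index, 0), n_bins - 1)


# Hypothetical example: 8 unidentified individuals of a genus with two species.
allocated = allocate_unidentified({"species_a": 30, "species_b": 10}, 8)

# Hypothetical bin range from 1 cm (log10 = 0) to about 562 cm (log10 = 2.75).
bin_small = log_length_bin(3.0, 0.0, 2.75)
bin_medium = log_length_bin(50.0, 0.0, 2.75)
bin_large = log_length_bin(600.0, 0.0, 2.75)
```

Equal width on the log scale means each interval spans the same ratio of maximum lengths rather than the same number of centimetres.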

Swept area density for each species was calculated by dividing the average number of individuals caught per haul by the average area swept per haul, estimated by multiplying the wing spread of the trawl by the average distance covered per haul. Swept area abundance was calculated by multiplying swept area density by survey area size. Swept area density and abundance can be converted to absolute density and abundance if catchability is known. Catchability, the fraction of the population in the path of the trawl that is retained and caught by the gear, can be estimated by dividing the swept area estimate of abundance by the absolute abundance provided by a stock assessment. Catchability is likely to differ between areas and species and depends on a number of factors including the properties of the trawl and species-dependent traits such as the size, behavior and distribution of the individuals. The catchability of each species was determined from a catchability model fitted to swept area survey abundance estimates and total abundance from available stock assessments. Average absolute abundance and density were calculated for each species based on 1000 simulated species catchabilities for each species and area.
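The arithmetic linking catch per haul, swept area, abundance and catchability can be sketched as follows. This is a minimal illustration of the definitions given above, not the fitted catchability model or the simulation used for the dataset. Function names, units and numbers are hypothetical.

```python
def swept_area_density(mean_catch_per_haul, wing_spread_m, mean_tow_distance_m):
    """Individuals per square metre: mean catch divided by mean area swept per haul."""
    area_swept_m2 = wing_spread_m * mean_tow_distance_m
    return mean_catch_per_haul / area_swept_m2


def swept_area_abundance(density_per_m2, survey_area_m2):
    """Swept area abundance: density scaled up to the size of the survey area."""
    return density_per_m2 * survey_area_m2


def catchability(swept_abundance, assessed_abundance):
    """Catchability estimated as swept area abundance over stock assessment abundance."""
    return swept_abundance / assessed_abundance


def absolute_abundance(swept_abundance, q):
    """Convert swept area abundance to absolute abundance for a known catchability q."""
    return swept_abundance / q


# Hypothetical species: 50 individuals per haul, 20 m wing spread, 2500 m tows,
# a survey area of 10,000 km^2 and an assessed stock of 40 million individuals.
density = swept_area_density(50.0, 20.0, 2500.0)
abundance = swept_area_abundance(density, 1.0e10)
q = catchability(abundance, 4.0e7)
recovered = absolute_abundance(abundance, q)
```

In this example the swept area estimate is a quarter of the assessed abundance, so dividing by the estimated catchability recovers the assessed value.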

With 11 log maximum species size intervals and 50 survey strata, the dataset consists of 550 records and 29 variables. A list of the variable names is provided in Table S1.2 in the supplementary information. Consult the Supplementary Information supplied with the electronic version of the article for further information.
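The record count follows from crossing every stratum with every size interval, as the short sketch below shows. The keys `stratum` and `size_class` are hypothetical placeholders; the actual variable names are those listed in Table S1.2.

```python
from itertools import product

N_STRATA = 50        # survey strata
N_SIZE_CLASSES = 11  # log maximum length intervals

# One record per combination of stratum and size class (placeholder keys).
records = [
    {"stratum": stratum, "size_class": size_class}
    for stratum, size_class in product(
        range(1, N_STRATA + 1), range(1, N_SIZE_CLASSES + 1)
    )
]
n_records = len(records)
```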

Usage notes

The R-code for the four species richness models and the dataset are also available on GitHub in the repository ‘DTUAqua/biodiversity’.


EU Network of Excellence, Award: GOCE-CT-2003-505446

European Commission, Award: 266445