Pacific Atoll Vegetation Maps
Data files
Dec 11, 2024 version files (480.55 MB total):
- atoll-shapefiles-gpkg-2024-12-04.zip (17.25 MB)
- classification_PDFs.zip (360.06 MB)
- classification_rasters.zip (97.97 MB)
- classification-processing.zip (2.34 KB)
- classification.zip (4.31 MB)
- master-atoll-database-2024-04-16.csv (84.13 KB)
- master-image-list-2024-04-16.csv (74.87 KB)
- master-island-database-2024-04-29.csv (651.60 KB)
- README.md (35.37 KB)
- validation.zip (110.64 KB)
Abstract
Vegetation classification maps of 235 Pacific atolls (1,925.6 km2 in total) featuring four land cover classes (broadleaf tree canopy, coconut palm canopy, low vegetation, and non-vegetated surface) at 2 m resolution. Coconut palms are mapped with a balanced accuracy of 85.3%, producer’s accuracy (sensitivity or recall) of 82.5%, user’s accuracy (positive predictive value) of 68.7%, and specificity of 88.1%. Balanced accuracies for broadleaf tree canopy and low vegetation were lower (75.5% and 70.3%, respectively), in part because these classes often appear similar in satellite imagery. Non-vegetated land was classified with a balanced accuracy of 87.7%. The 235 classification maps feature an overall accuracy of 71.1%, significantly higher than the no-information rate of 34.4% (p = 2.2e−16). Across the 235 mapped atolls, 36.6±1.0% of vegetated surfaces featured a coconut palm canopy. By area, 58.3±1.8% of tree canopies (i.e. excluding low-statured vegetation) were coconut palm. A patch classifier identified 310.9 km2 of dense, monodominant coconut stands across the 235 mapped atolls, representing 51.2% of the study-wide coconut area. The classification maps are provided as georeferenced GeoTIFF files as well as PDF files for ease of viewing. Tabular databases including per-atoll and per-islet land cover data are also included, along with geopolitical and historical data about each atoll.
README: Very High Resolution Vegetation Maps of 235 Pacific Atolls
https://doi.org/10.5061/dryad.0k6djhb7x
Michael W. Burnett - The Nature Conservancy and University of California Santa Barbara - mburnett@ucsb.edu
Description of the data and file structure
Overview
This Dryad deposit contains data related to the article "Satellite imagery reveals widespread coconut plantations on Pacific atolls" by M.W. Burnett et al. in Environmental Research Letters. Specifically, the following data may be accessed here:
- 235 georeferenced rasters containing the vegetation classifications of Pacific atolls at 2 m spatial resolution (GeoTIFF format)
- 235 shapefiles containing summary land cover data and unique identification numbers for Pacific atolls' individual islands (GPKG format)
- 221 georeferenced PDF files containing vegetation classifications of Pacific atolls at 2 m spatial resolution, plus unique island identification numbers (GeoPDF format). PDF maps were not produced for 14 of the classified atolls due to technical problems, but all the information included in the PDFs can be extracted from other files in this repository.
- Tabular database containing land cover, rainfall, historical, and geopolitical data for 266 Pacific atolls (CSV format)
- Tabular database containing land cover data for 8,944 islands (CSV format)
- Tabular database describing the imagery used to create the land cover classifications (CSV format)
The GeoTIFF and GPKG files are only recommended for users with access to GIS software who want to closely examine spatial data from individual atolls or run further analyses. Users who want to consult the land cover maps without GIS software should use the PDF files, although some detail is lost in these files, especially on larger atolls; alternatively, they can consult The Nature Conservancy's Geospatial Conservation Atlas (see Sharing/Access information below). Users seeking summary data for certain islands, atolls, regions, or countries should use the tabular databases.
Raster Files
The raster files represent the foundation of this dataset, containing complete land cover data for 235 Pacific atolls at 2 m spatial resolution. Every pixel in one of these files either contains no data (representing water) or is assigned a value from the following classification scheme:
Land cover type | Raster pixel value |
---|---|
Coconut canopy | 0 |
Broadleaf tree canopy | 1 |
Low vegetation | 2 |
Non-vegetated land | 5 |
Cloud obstruction | 9 |
The "Cloud obstruction" category represents areas that are probably land, but were obstructed by clouds in the imagery used to produce the land cover classification. These areas were masked by hand.
Each raster is provided as a GeoTIFF file georeferenced to the WGS84 UTM zone containing the center of the image. For instance, Kure Atoll (28°25′N 178°20′W, just east of the 180th meridian) lies in UTM Zone 1N and uses EPSG:32601 as its coordinate reference system. GeoTIFFs are byte-type (eight-bit unsigned integer) and compressed with lossless LZW compression.
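The zone assignment described above can be reproduced with a few lines of arithmetic. The following is a minimal sketch, not part of the deposited scripts: the function names `utm_epsg` and `dms_to_decimal` are hypothetical, and the calculation uses the standard 6-degree UTM zone layout (it ignores the high-latitude zone exceptions around Norway and Svalbard, which do not affect Pacific atolls).

```python
import math

def dms_to_decimal(degrees, minutes, hemisphere):
    """Convert degrees and minutes plus a hemisphere letter to decimal degrees."""
    value = degrees + minutes / 60.0
    return -value if hemisphere in ("S", "W") else value

def utm_epsg(lon_deg, lat_deg):
    """Return (UTM zone number, WGS84 / UTM EPSG code) for a point.

    Hypothetical helper: assumes the standard 6-degree zone layout.
    """
    # Wrap longitude into [-180, 180) so that 180 E falls in zone 1.
    lon = ((lon_deg + 180.0) % 360.0) - 180.0
    zone = int(math.floor((lon + 180.0) / 6.0)) + 1
    # EPSG 326xx are the northern-hemisphere zones, 327xx the southern ones.
    base = 32600 if lat_deg >= 0 else 32700
    return zone, base + zone

# Kure Atoll, using the coordinates quoted in this README.
kure_lat = dms_to_decimal(28, 25, "N")
kure_lon = dms_to_decimal(178, 20, "W")
kure_zone, kure_code = utm_epsg(kure_lon, kure_lat)
print(f"UTM zone {kure_zone}, EPSG:{kure_code}")
```

For Kure Atoll this gives zone 1 and EPSG:32601, matching the example in the text. Atolls south of the equator fall in the 327xx series.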
Shapefiles
Shapefiles include the outlines of all 235 classified atolls, with one feature corresponding to every landmass with an area greater than 100 square meters. The outlines were generated directly from the land cover classification rasters, wherein water was masked using the near-infrared bands of WorldView-2 in a binary random forest classifier. As the rasters have 2 m spatial resolution, zooming in close to the shapefiles will reveal "jagged" coastlines.
Some shapefiles also have small areas where feature outlines extend outside of the classified rasters; these are artifacts of errors in the image classification process that were subsequently masked out of the rasters, but which are still visible in the shapefiles. The total area of these errors is negligible relative to the size of the islands.
Examining the attribute table of one of the shapefiles will reveal ten fields (columns). "fid" and "DN" are processing-related fields that can be disregarded by users. The six columns beginning with "HISTO_" correspond to the land cover types presented above in the Raster Files section; values in these fields reflect the number of pixels within a feature that were classified as that class. "HISTO_NODATA" counts small numbers of erroneous pixels included in the outlines (see above). "total" sums the number of pixels classified as 0, 1, 2, 5, and 9; i.e., the total number of land pixels making up the landmass. Pixel counts in the "HISTO_" and "total" fields can be converted to square meters by multiplying by 4, since each pixel is 2 meters by 2 meters. Finally, "id" is a unique identifier given to each of the 8,944 islands mapped by this study.
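As a quick illustration of the pixel-to-area conversion described above, the sketch below converts one feature's pixel counts to square meters. The attribute values are made up for illustration, and `feature_areas_m2` is a hypothetical helper rather than a function from the deposited scripts; only the field names and the factor of 4 come from this README.

```python
PIXEL_SIZE_M = 2.0
PIXEL_AREA_M2 = PIXEL_SIZE_M ** 2  # each pixel is 2 m x 2 m = 4 square meters

# Field names from the shapefile attribute table, mapped to land cover types.
CLASS_FIELDS = {
    "HISTO_0": "coconut canopy",
    "HISTO_1": "broadleaf tree canopy",
    "HISTO_2": "low vegetation",
    "HISTO_5": "non-vegetated land",
    "HISTO_9": "cloud obstruction",
}

def feature_areas_m2(attributes):
    """Convert one feature's pixel counts to areas in square meters."""
    areas = {
        label: attributes[field] * PIXEL_AREA_M2
        for field, label in CLASS_FIELDS.items()
    }
    areas["total land"] = attributes["total"] * PIXEL_AREA_M2
    return areas

# Made-up attribute row for a single island (not real data).
example = {
    "id": 1, "HISTO_0": 1200, "HISTO_1": 800, "HISTO_2": 300,
    "HISTO_5": 150, "HISTO_9": 50, "HISTO_NODATA": 3, "total": 2500,
}

areas = feature_areas_m2(example)
for label, area in areas.items():
    print(f"{label}: {area:.0f} m2 ({area / 1e6:.4f} km2)")
```

In this made-up row the five class counts sum to the "total" field, as the field description implies, and "HISTO_NODATA" is left out of the land total.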
PDF Maps
To provide non-GIS users with a means of visualizing the map data, 221 of the 235 rasters are provided as pre-rendered PDF files (PDFs could not be produced for the remaining 14 atolls; see Overview). These files are also georeferenced (GeoPDF) in the same manner as the GeoTIFF rasters, in case one wants to use them in a geospatial application. Legends, scalebars, and north arrows are provided in each PDF map (scales vary widely given the range of atoll sizes). The PDFs do not include bitmaps at their full resolutions, so progressively more detail is lost as one views larger atolls, and users should not expect to be able to examine maps at 2 m resolution. Because of bitmap rendering, some classification pixels also blend together and form different colors if the PDF is zoomed in closely.
Tabular Atoll Database
The "master-atoll-database-2024-04-16.csv" file contains summary information about the 266 Pacific atolls identified by our study, including geopolitical information, location, rainfall, copra production history, protection, and land cover data. The columns of this tabular database are explained below:
Column name | Explanation |
---|---|
Atoll | The most common, generally accepted name that we could determine for each atoll. |
Alternative names | Any other names we found for an atoll, including local names and obsolete names from nautical charts, are included in this column. These lists should not be considered exhaustive. |
Type | Atolls are classified into three types: "Atoll", "Closed Atoll", and "Coral Island". Closed atolls feature continuous reef rims without deep breaks connecting the lagoon to the ocean. Coral islands do not possess a lagoon but are geologically similar to atolls; these are sometimes called "table reefs". For more information see Goldberg (2016) in Atoll Research Bulletin. |
Country | The country to which each atoll belongs. |
Group | The archipelago or island group to which an atoll belongs. Some groups include atolls from multiple countries. |
Subgroup | Where applicable, the more specific island group to which an atoll belongs. |
Lat | Latitude in decimal degrees. |
Lon | Longitude in decimal degrees. |
Average Rainfall (mm/yr) | Mean annual rainfall in units of mm per year derived from GPM IMERG V06. |
Inhabited? | Binary classification of whether an atoll has permanent inhabitants (1) or is uninhabited (0). This information was sourced on an ad-hoc, informal basis and should be used with caution. |
History of copra production | Atolls are assigned "Yes" if an internet search yielded confirmation that copra was produced and exported from the island at some point in history. "No" indicates the history of the atoll is well known and no significant copra production occurred. "Unknown" means no information could be found in either direction. |
Copra reference | URL leading to resources used to determine copra history. |
Elevation | The highest elevation of each atoll above sea level (m), according to Nunn et al. (2016) in Geoscience Letters. |
cocos km2 | Total area classified as coconut canopy (square km). |
broadleaf km2 | Total area classified as broadleaf tree canopy (square km). |
shrub km2 | Total area classified as low vegetation (square km). |
non_veg km2 | Total area classified as non-vegetated land (square km). |
cloud km2 | Total area classified as cloud and subsequently masked (square km). |
total km2 | Sum of previous 5 columns: total land area of the atoll (square km). Note that this total may neglect large sandbars that were excluded in the image acquisition process (which targeted only areas with or near vegetation) and may thus diverge from other estimates of atoll land area. |
total non-cloud km2 | Sum of coconut, broadleaf, low vegetation, and non-vegetation areas, excluding cloud obstructions (square km). This column will underestimate some atolls' land areas, but the total non-cloud areas were used to calculate proportional land cover data for each atoll. |
cocos% | Proportion of an atoll's non-cloud-obstructed land area classified as coconut canopy (%). |
broadleaf% | Proportion of an atoll's non-cloud-obstructed land area classified as broadleaf tree canopy (%). |
shrub% | Proportion of an atoll's non-cloud-obstructed land area classified as low vegetation (%). |
non_veg% | Proportion of an atoll's non-cloud-obstructed land area classified as non-vegetated land (%). |
cloud% | Proportion of an atoll's land area obstructed by clouds in the classified image(s) (%). |
cocos/veg% | Proportion of an atoll's vegetated area (including coconuts, broadleaf trees, and low vegetation) classified as coconut canopy (%). This is called Coconut Canopy Fraction in the *ERL* article. |
cocos/tree% | Proportion of an atoll's tall vegetation (including coconuts and broadleaf trees) classified as coconut canopy (%). |
monocrop km2 | Total area classified as a coconut palm monocrop (square km). |
monocrop coconut km2 | Total area of coconut palm canopy contained within monocropped areas (square km). |
% of coconut existing in monocrop | Proportion of an atoll's total coconut canopy area contained within monocropped areas (%). |
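
The relationships among the area and percentage columns above can be sketched as follows. This is an illustrative reading of the column descriptions, not the author's code: the function name `land_cover_percentages` is hypothetical, the input areas are made up, and the choice of total land area (including cloud) as the denominator for "cloud%" is an assumption based on the wording of that column's description.

```python
def land_cover_percentages(cocos, broadleaf, shrub, non_veg, cloud):
    """Derive the summary columns from the five per-class areas (square km)."""
    total = cocos + broadleaf + shrub + non_veg + cloud
    non_cloud = total - cloud
    vegetated = cocos + broadleaf + shrub
    trees = cocos + broadleaf

    def pct(part, whole):
        # Guard against empty denominators (e.g. an atoll with no tree canopy).
        return 100.0 * part / whole if whole > 0 else float("nan")

    return {
        "total km2": total,
        "total non-cloud km2": non_cloud,
        "cocos%": pct(cocos, non_cloud),
        "broadleaf%": pct(broadleaf, non_cloud),
        "shrub%": pct(shrub, non_cloud),
        "non_veg%": pct(non_veg, non_cloud),
        "cloud%": pct(cloud, total),  # assumed denominator: all land, incl. cloud
        "cocos/veg%": pct(cocos, vegetated),
        "cocos/tree%": pct(cocos, trees),
    }

# Made-up areas in square km (not taken from the database).
summary = land_cover_percentages(
    cocos=3.0, broadleaf=2.0, shrub=1.0, non_veg=1.5, cloud=0.5
)
for column, value in summary.items():
    print(f"{column}: {value:.2f}")
```

Because the four class percentages share the non-cloud denominator, they sum to 100% for any atoll with non-cloud land.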
Tabular Island Database
The "master-island-database-2024-04-29.csv" file contains summary land cover information for 8,944 islands identified by our study. The columns of this tabular database are explained below:
Column name | Explanation |
---|---|
ID | Unique ID for each island. These IDs correspond to the "id" column in the shapefiles, and to the labels on the PDF maps. |
Atoll | Name of the atoll to which each island belongs. |
HISTO_0 | Number of pixels (2x2 m) classified as coconut canopy on each island. |
HISTO_1 | Number of pixels (2x2 m) classified as broadleaf tree canopy on each island. |
HISTO_2 | Number of pixels (2x2 m) classified as low vegetation on each island. |
HISTO_5 | Number of pixels (2x2 m) classified as non-vegetated land on each island. |
HISTO_9 | Number of pixels (2x2 m) masked as clouds on each island. |
total | Total number of pixels (2x2 m) that comprise each island. |
cocos% | Proportion of an island's non-cloud-obstructed land area classified as coconut canopy (%). |
non_veg% | Proportion of an island's non-cloud-obstructed land area classified as non-vegetated land (%). |
cocos/veg% | Proportion of an island's vegetated area (including coconuts, broadleaf trees, and low vegetation) classified as coconut canopy (%). This is called Coconut Canopy Fraction in the *ERL* article. |
broadleaf% | Proportion of an island's non-cloud-obstructed land area classified as broadleaf tree canopy (%). |
total km2 | Total land area (square km) of each island, including cloud-obstructed areas. |
Tabular Image Database
The "master-image-list-2024-04-16.csv" file contains information about the 459 very high resolution multispectral satellite images acquired for this project. All images were acquired from Maxar's WorldView-2 satellite. While the images themselves cannot be published due to licensing considerations, information about each image (including its unique identifier) is provided below. This file also includes information on the phase angle-based classification that was applied to each image.
Column name | Explanation |
---|---|
Atoll | Name of atoll targeted by image. |
Datetime UTC | The date and time (UTC) at which an image was captured by WorldView-2. |
Maxar Image ID | Unique identifier for all Maxar images. |
Target lat | Latitude of the target atoll in decimal degrees. |
Off-nadir angle | Angle (deg) between the satellite's nadir (vertical line between the satellite and the Earth's surface directly below it) and the center of the satellite image strip at the moment of image acquisition. |
Sun elevation | Angular elevation (deg) of the sun from the perspective of the target atoll at the moment of image acquisition. |
Sun azimuth | Solar azimuth angle (deg) from north to the sun from the perspective of the target atoll at the moment of image acquisition. |
Target azimuth | Azimuth angle (deg), measured from north, of the compass direction from the satellite to the target atoll. |
Camera elevation | Angular elevation (deg) of the satellite from the perspective of the target atoll at the moment of image acquisition. |
Camera azimuth | Azimuth angle (deg) from north to the satellite from the perspective of the target atoll at the moment of image acquisition. |
Sun RA | Right ascension (deg) of the sun from the perspective of the target atoll at the moment of image acquisition. |
Sun d | Declination (deg) of the sun from the perspective of the target atoll at the moment of image acquisition. |
Camera RA | Right ascension (deg) of the satellite from the perspective of the target atoll at the moment of image acquisition. |
Camera d | Declination (deg) of the satellite from the perspective of the target atoll at the moment of image acquisition. |
Phase angle | Great circle angle (deg) between the satellite and the sun from the perspective of the target atoll at the moment of image acquisition. |
Best classification phase angle range | Range of phase angles (plus or minus) ultimately used to select training data to classify an image. |
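
The phase angle defined above is the angular separation between two directions in the sky, which can be computed from their elevations and azimuths with the spherical law of cosines. The sketch below is one way to do this; it is not taken from the deposited scripts, the function name `phase_angle_deg` is hypothetical, and whether the published values were derived from the elevation/azimuth columns or from the right ascension/declination columns is not stated here.

```python
import math

def phase_angle_deg(sun_elev, sun_az, cam_elev, cam_az):
    """Great-circle angle (deg) between the sun and the satellite.

    All inputs are in degrees, as seen from the target atoll.
    """
    se, sa, ce, ca = (math.radians(v) for v in (sun_elev, sun_az, cam_elev, cam_az))
    # Spherical law of cosines for two directions given as (elevation, azimuth).
    cos_phase = (
        math.sin(se) * math.sin(ce)
        + math.cos(se) * math.cos(ce) * math.cos(sa - ca)
    )
    # Clamp to [-1, 1] to absorb floating-point rounding before acos.
    cos_phase = max(-1.0, min(1.0, cos_phase))
    return math.degrees(math.acos(cos_phase))

# Made-up geometry: sun 60 deg high in the northeast,
# satellite 70 deg high in the southwest.
example_phase = phase_angle_deg(sun_elev=60.0, sun_az=45.0, cam_elev=70.0, cam_az=225.0)
print(f"phase angle: {example_phase:.1f} deg")
```

A sun directly overhead and a satellite at 60 degrees elevation give a phase angle of 30 degrees, which makes a convenient sanity check.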
Sharing/Access information
Land cover classifications and some tabular data can be viewed on The Nature Conservancy's Geospatial Conservation Atlas:
- Primary link: https://geospatial.tnc.org/apps/2bd5e7a72c63416ca5e137d840a0da93/explore
- Alternate link: https://arcg.is/1fmfmf
Code/Software
Classifications were created, processed, and analyzed using R 4.0.3 to 4.3.1 and a wide range of packages including terra, raster, sp, sf, glcm, randomForest, doParallel, foreach, caret, dplyr, rgdal, tidyr, and more. Several R scripts are included in this repository:
- classification.zip contains most of the scripts and resources used to create the land cover classifications:
- Maxar_GLCM_AWS_v2.R is built to run on AWS EC2 instances running a custom AMI from Louis Aslett. The script takes a 50 cm panchromatic WorldView-2 image and generates eight textural features (grey level co-occurrence matrix, or GLCM features) for the image, then pushes the eight texture rasters to an AWS S3 bucket. Note that some of the software and infrastructure needed for this to run may now be deprecated!
- atoll-classifier-v2.R reads in a training point database (see below) and produces water-masked versions of each raw image (raw images featuring eight spectral and eight textural features at co-registered 2 m resolution: coastal, blue, green, yellow, red, red-edge, NIR1, NIR2, mean, variance, homogeneity, contrast, dissimilarity, entropy, second moment, correlation). After water-masking each image, the script includes sections that use the random forest algorithm to classify every image using the entire training point database as well as restricted subsets of the database with different phase angle (SSA) tolerances (see article for more information). Each image is thus classified five times. As stated in the article, the water masking portion of this script was run once, after which the classification sections were essentially run three times on progressively smaller sets of images (once in each training round using a different training data set).
- trainingData.all.r1.csv contains the training points (with eight spectral and eight textural values per point) used for training round 1. Metadata for the satellite image from which the points were extracted are included, including phase angle (SSA).
- trainingData.all.r2.csv - as above, but for training round 2.
- trainingData.all.r3.csv - as above, but for training round 3.
- mass-classifier-v2-island-list-2022-12-30.csv contains a list of images from which the atoll-classifier script skims metadata.
- classification-processing.zip contains three R scripts that were used to process and analyze the raw random forest classifications:
- majority-filter-v2.R applies a 3x3 pixel (36 square m) moving-window majority filter to the classifications.
- area_summer_v2.R counts the pixels of each class in every majority-filtered classification and returns the results as a CSV file. This script was used to extract the land cover data used in the article and provided in the tabular databases.
- monocrop-v2.R uses a dominance threshold (user-adjustable) and a moving window of a set size (user-adjustable) to highlight areas of coconut palm monodominance in the majority-filtered classifications. This script was used to produce the monocrop statistics that appear in the ERL article.
- validation.zip contains the script and resources used to validate the land cover classifications:
- validator-v2.R reads an independent set of validation datapoints randomly sampled from the 235 Pacific atolls and labeled by trained observers. The script then extracts the predicted land cover values from the appropriate classification rasters, outputting a database of predicted and true land cover datapoints as a CSV file.
- random_pts_2023_08_02_v1_wgs84.geojson is the validation point database used by the script.
- validation_v3.csv is the combined observed-and-predicted database produced by the script.
- validation_v2.txt contains a confusion matrix and various accuracy statistics generated by the R script from the validation database.
- island-data-regressions-v3.R uses the atoll and island land cover databases to construct the linear models and accompanying figures that appear in the *ERL* article.
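To illustrate the 3x3 moving-window majority filter mentioned for majority-filter-v2.R, here is a minimal pure-Python sketch. It is not the deposited R script, and several details are assumptions made for illustration: no-data (water) pixels are left untouched and ignored as neighbors, edge pixels use whatever neighbors exist, and a pixel keeps its original value when the window has no single most common class.

```python
from collections import Counter

NODATA = None  # stand-in for water / no-data pixels

def majority_filter_3x3(grid):
    """Replace each land pixel with the most common class in its 3x3 window."""
    rows, cols = len(grid), len(grid[0])
    out = [row[:] for row in grid]
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] is NODATA:
                continue  # assumption: no-data pixels are never filled in
            counts = Counter()
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr, cc = r + dr, c + dc
                    inside = 0 <= rr < rows and 0 <= cc < cols
                    if inside and grid[rr][cc] is not NODATA:
                        counts[grid[rr][cc]] += 1
            ranked = counts.most_common()
            best = ranked[0][1]
            winners = [value for value, n in ranked if n == best]
            if len(winners) == 1:  # assumption: ties keep the original value
                out[r][c] = winners[0]
    return out

# Tiny made-up raster: one isolated coconut pixel (0) surrounded by
# broadleaf pixels (1), with a water pixel in the corner.
raster = [
    [1, 1, 1],
    [1, 0, 1],
    [1, 1, NODATA],
]
filtered = majority_filter_3x3(raster)
for row in filtered:
    print(row)
```

The isolated center pixel is reassigned to the surrounding class, which is the speckle-removal effect a majority filter is meant to provide.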
The WorldView-2 imagery used to produce the land cover classifications cannot legally be shared on this repository. The GLCM layers used for classification are not hosted here because of their very large sizes (each GLCM file is an eight-layer raster at 50 cm resolution with float32 datatype, often covering huge expanses of ocean). Additional data, intermediary processing outputs, and scripts can be requested from the author (mburnett@ucsb.edu).
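For readers who want to recompute accuracy statistics from the validation database, the sketch below shows the standard definitions of the per-class metrics quoted in the abstract, computed from a confusion matrix. The matrix values are made up, the function names are hypothetical, and the orientation (rows are predicted classes, columns are reference classes) is an assumption; check the layout of validation_v2.txt before applying it to the real matrix.

```python
def class_metrics(matrix, k):
    """Per-class metrics for class index k.

    Assumes matrix[i][j] counts points predicted as class i whose
    reference (observer-labeled) class is j.
    """
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    tp = matrix[k][k]
    fp = sum(matrix[k]) - tp                       # predicted k, actually other
    fn = sum(matrix[i][k] for i in range(n)) - tp  # actually k, predicted other
    tn = total - tp - fp - fn
    sensitivity = tp / (tp + fn)  # producer's accuracy (recall)
    specificity = tn / (tn + fp)
    return {
        "producers_accuracy": sensitivity,
        "users_accuracy": tp / (tp + fp),  # positive predictive value
        "specificity": specificity,
        "balanced_accuracy": (sensitivity + specificity) / 2.0,
    }

def overall_accuracy(matrix):
    """Share of all validation points that were classified correctly."""
    total = sum(sum(row) for row in matrix)
    return sum(matrix[i][i] for i in range(len(matrix))) / total

def no_information_rate(matrix):
    """Share of validation points in the largest reference class."""
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    return max(sum(matrix[i][j] for i in range(n)) for j in range(n)) / total

# Made-up 4-class confusion matrix (not the study's results).
confusion = [
    [80, 15, 5, 0],
    [10, 60, 20, 0],
    [8, 20, 50, 2],
    [2, 5, 5, 18],
]
first_class = class_metrics(confusion, 0)
print(first_class)
print(overall_accuracy(confusion), no_information_rate(confusion))
```

Under these definitions, balanced accuracy is the mean of producer's accuracy and specificity, which agrees with the coconut figures reported in the abstract: (82.5% + 88.1%) / 2 = 85.3%.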
Methods
Maps are based on an iterative random forest classification using spectral and textural features extracted from WorldView-2 imagery, trained and validated by human observers using 44,000 training points and 1,969 validation points. See the related article for more methodological information:
Burnett, M.W., French, R., Jones, B., Fischer, A., Holland, A., Roybal, I., White, T.D., Steibl, S., Anderegg, L.D.L., Young, H., Holmes, N.D., Wegmann, A., 2024. Satellite imagery reveals widespread coconut plantations on Pacific atolls. Environmental Research Letters 19, 124095. https://doi.org/10.1088/1748-9326/ad8c66