A curated database of milky sea observations from 1600 to present
Data files
Mar 11, 2025 version (423.09 KB total)
- Milky_Sea_Database.tsv (419.12 KB)
- README.md (3.97 KB)
Abstract
Milky seas are a rare form of nocturnal oceanic bioluminescence distinguished by a steady, non-flashing, white/gray/green glow. For centuries, scientific inquiry into milky seas has been held back by the remote and ephemeral nature of the phenomenon. Combining centuries of eyewitness accounts with modern satellite observations, we present a curated list of milky sea observations since 1600. This database greatly expands the ability to study when and where milky seas occur, as well as the commonly observed features of a milky sea.
https://doi.org/10.5061/dryad.0gb5mkmbc
Description of the data and file structure
The data in this archive was collected for the paper “From Sailors to Satellites: A Curated Database of Milky Seas Since 1600”. By combining centuries of eyewitness accounts with satellite observations for the first time, we hope to expand the ability to study and understand milky seas.
Files and variables
File: Milky_Sea_Database.pdf
Description: A human-readable PDF of every eyewitness account and satellite observation within the database. For every milky sea observation, this file contains the date, location, description, and who reported the account. This is Supplemental 1 for the paper.
File: Milky_Sea_Database.tsv
Description: A machine-readable tab-separated values file containing detailed information on every milky sea observation within the database.
Variables
- Observation Start Date: The date in MM/DD/YYYY format for the start of the observation.
- Observation Start Hour: The hour of the day and the timezone if known for the start of the observation in HH:MM TMZ format. Unknown timezones are listed as UNK. Unknown hours are marked as (?).
- Observation End Date: The date in MM/DD/YYYY format for the end of the observation. Unknown end dates are marked as (?).
- Observation End Hour: The hour of the day and the timezone if known for the end of the observation in HH:MM TMZ format. Unknown timezones are listed as UNK. Unknown hours are marked as (?).
- Approximate Lat: The approximate latitude of the observation in XX deg YY’ N/S format (e.g. 10 deg 30’ N).
- Approximate Lon: The approximate longitude of the observation in XX deg YY’ E/W format (e.g. 52 deg 30’ E).
- Observing Ship/Sensor: The ship or satellite sensor that the observation originates from. Unknown ships are marked as (?).
- Observor(s): Who made the observation. For eyewitness accounts, the names and ranks of the individuals who made the observation. For satellite observations, the paper where the observation was first reported. Unknown observers are marked as (?).
- Description: The eyewitness account transcribed from the original source or a statement that the observation comes from a satellite.
- Reported In: The source of the observation, whether it be personal correspondence, scientific publication, newspaper, etc.
- Approx Location: An approximate region of the world where the sighting/observation occurred.
- Phenomena: Says Milky Sea for all observations. Included in case the database is ever expanded to cover other rare bioluminescent phenomena.
- Confidence: A coded number signifying the authors’ confidence that an observation is truly a milky sea, 0: High Confidence, 1: Low Confidence, 2: Very Low Confidence.
- Area KM2: For satellite observed events, the area of the milky sea event in square kilometers. For eyewitness accounts, N/A is used as a placeholder.
- Event Number: The number identifying which distinct milky sea event the observation belongs to when observations are grouped using the 90-day and 8-degree radius thresholds outlined in the paper.
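The coordinate, confidence, and area fields above are stored as text, so they usually need converting before analysis. The following is a minimal Python sketch of one way to do that, not the authors' code. It assumes coordinates always use whole degrees and whole minutes as in the examples above (e.g. 10 deg 30’ N), and that any value which does not match, such as a (?) placeholder, should be treated as missing. The names parse_coord, parse_area, and CONFIDENCE_LABELS are hypothetical and chosen for illustration.

```python
import re
from typing import Optional

# Matches strings such as "10 deg 30’ N". Accepts a curly or straight
# apostrophe after the minutes. Assumption: whole degrees and minutes only.
_COORD = re.compile(r"^\s*(\d+)\s*deg\s*(\d+)\s*[’']?\s*([NSEW])\s*$")

# Codes as documented for the Confidence column.
CONFIDENCE_LABELS = {
    "0": "High Confidence",
    "1": "Low Confidence",
    "2": "Very Low Confidence",
}


def parse_coord(text: str) -> Optional[float]:
    """Convert 'XX deg YY’ H' to signed decimal degrees.

    South and West are returned as negative values. Returns None when the
    text does not match the documented format (assumed to mean "unknown").
    """
    match = _COORD.match(text)
    if match is None:
        return None
    degrees, minutes, hemisphere = match.groups()
    value = int(degrees) + int(minutes) / 60.0
    return -value if hemisphere in "SW" else value


def parse_area(text: str) -> Optional[float]:
    """Return Area KM2 as a float, or None for the N/A placeholder."""
    try:
        return float(text.strip())
    except ValueError:
        return None
```

For example, parse_coord("10 deg 30’ N") gives 10.5 and parse_coord("8 deg 15' S") gives -8.25. If the file contains coordinate variants not described above, the regular expression would need to be extended.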
Code/software
The PDF can be viewed with any PDF viewer. The .tsv file can be viewed in most text or code editors. Example code to open, process, and utilize the data within this file can be found at https://github.com/JustinRHudson/MilkySeaDatabase.
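For a quick start without the linked repository, the .tsv file can also be read with the Python standard library. This is a minimal sketch, not the example code referred to above. It assumes the header row of the file uses the variable names exactly as listed in this README (for example Observation Start Date and Confidence) and that the file is UTF-8 encoded. The function names are hypothetical.

```python
import csv
from datetime import date, datetime
from typing import Dict, Iterable, List, Optional


def read_rows(lines: Iterable[str]) -> List[Dict[str, str]]:
    """Parse tab-separated lines into one dict per observation."""
    return list(csv.DictReader(lines, delimiter="\t"))


def load_database(path: str) -> List[Dict[str, str]]:
    """Read the database from disk. The UTF-8 encoding is an assumption."""
    with open(path, newline="", encoding="utf-8") as handle:
        return read_rows(handle)


def parse_date(text: str) -> Optional[date]:
    """Parse MM/DD/YYYY; return None for (?) or any other non-date text."""
    try:
        return datetime.strptime(text.strip(), "%m/%d/%Y").date()
    except ValueError:
        return None


def high_confidence(rows: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Keep observations whose Confidence code is 0 (High Confidence)."""
    return [row for row in rows if row.get("Confidence", "").strip() == "0"]
```

A typical call would be rows = load_database("Milky_Sea_Database.tsv"), followed by high_confidence(rows) to restrict the analysis to the most reliable observations.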
Access information
Other publicly accessible locations of the data: