Dryad

Data-driven predictions of summertime visits to lakes across 17 US states

Abstract

Using a dataset of more than 51,000 US lakes, we estimated the relationship between summertime lake visits and lake water quality, landscape features, and other amenities, with visitation estimated from counts of geolocated photographs. Given the size and complexity of our dataset, we used a combination of machine learning techniques, imputation techniques, and a Poisson count model to estimate these relationships. First, we found that every additional meter of average summertime Secchi depth was associated with at least 7% more summertime lake visits, all else equal. Second, we found that lake amenities, such as beaches, boat launches, and public toilets, were more powerful predictors of visits than water quality. Third, we found that visits to a lake were strongly influenced by the lake’s accessibility, its distance to nearby lakes, and the amenities those nearby lakes offered. Our research highlights the need for 1) a better understanding of how representative social media data are of actual recreational behavior, 2) the development of best practices to account for non-random patterns in missing natural feature data, and 3) a better understanding of the potential reverse causality in the relationship between lake visits and water quality.
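As a reading aid for the percentage statement above, the following minimal sketch shows how a coefficient from a Poisson count model maps to a percent change in expected visits. It assumes the standard log link, which the abstract does not state explicitly. The function names and the value of `beta_secchi` are hypothetical and chosen only to reproduce a 7% effect; they are not estimates taken from the dataset.

```python
import math


def percent_change_per_unit(beta: float) -> float:
    """Percent change in the expected count for a one-unit increase in a
    covariate, assuming a Poisson model with a log link:
    E[y | x] = exp(b0 + beta * x)."""
    return (math.exp(beta) - 1.0) * 100.0


def coefficient_for_percent_change(pct: float) -> float:
    """Inverse mapping: the log-link coefficient that implies a given
    percent change per unit of the covariate."""
    return math.log(1.0 + pct / 100.0)


# Hypothetical coefficient on Secchi depth (per meter), back-calculated so
# that it corresponds to the 7% lower bound quoted in the abstract.
beta_secchi = coefficient_for_percent_change(7.0)

print(f"illustrative coefficient: {beta_secchi:.4f}")
print(f"implied change per meter: {percent_change_per_unit(beta_secchi):.1f}%")
```

Under this assumption the effect is multiplicative, so two additional meters of Secchi depth would scale expected visits by the one-meter factor squared rather than adding the percentages.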