Disease seasonality estimation dataset for: Do psychiatric diseases follow annual cyclic seasonality?

Cite this dataset

Zhang, Hanxin (2021). Disease seasonality estimation dataset for: Do psychiatric diseases follow annual cyclic seasonality? [Dataset]. Dryad.


Seasonal affective disorder famously follows annual cycles, with elevated incidence in the fall and spring. Should some version of a cyclic annual pattern be expected from other psychiatric disorders? Would annual cycles be similar for distinct psychiatric conditions? This study probes these questions using two very large datasets describing the health histories of 150 million unique Americans and the entire Swedish population. We performed two types of analysis, using “uncorrected” and “corrected” observations. The former analysis focused on counts of daily patient visits associated with each disease. The latter analysis instead looked at the proportion of disease-specific visits within the total volume of visits for a time interval. In the uncorrected analysis, we found that the annual patterns of psychiatric diseases were remarkably similar across the studied diseases in both countries, with the magnitude of annual variation significantly higher in Sweden than in the US for psychiatric, but not infectious, diseases. In the corrected analysis, only one group of patients, those 11 to 20 years old, reproduced all the regularities we observed for psychiatric disorders in the uncorrected analysis; the annual healthcare-seeking visit patterns associated with the other age groups changed drastically. Analogous analyses of infectious diseases diverged less between the two types of computation. Comparing the two sets of results in the context of published studies of psychiatric disease seasonality, we tend to believe that our uncorrected results are likely to capture the real trends, while the corrected results mostly reflect artefacts driven by the dominant fluctuations in overall health-seeking visits across the year. In the spirit of full disclosure, we present both unredacted sets of results even-handedly and leave the verdict to the readers.
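The distinction between the two types of observation can be illustrated with a minimal sketch. This is not the study's code: the function names, the (day, disease) record layout, and the choice of a single day as the time interval are assumptions made for illustration, and the visit records are toy values rather than data from the dataset.

```python
from collections import defaultdict


def uncorrected_counts(visits):
    """Count visits per (day, disease) pair: the 'uncorrected' observation."""
    counts = defaultdict(int)
    for day, disease in visits:
        counts[(day, disease)] += 1
    return dict(counts)


def corrected_proportions(visits):
    """Divide each (day, disease) count by that day's total visit volume:
    the 'corrected' observation, with one day as the assumed interval."""
    totals = defaultdict(int)
    for day, _disease in visits:
        totals[day] += 1
    return {
        (day, disease): n / totals[day]
        for (day, disease), n in uncorrected_counts(visits).items()
    }


# Toy records (hypothetical): one depression visit on each day,
# but far more total visits on the winter day.
visits = [
    ("2010-01-01", "depression"),
    ("2010-01-01", "influenza"),
    ("2010-01-01", "influenza"),
    ("2010-01-01", "influenza"),
    ("2010-07-01", "depression"),
    ("2010-07-01", "influenza"),
]

counts = uncorrected_counts(visits)
props = corrected_proportions(visits)

print(counts[("2010-01-01", "depression")], counts[("2010-07-01", "depression")])
print(props[("2010-01-01", "depression")], props[("2010-07-01", "depression")])
```

In this toy case the uncorrected count for depression is the same on both days, while the corrected proportion doubles on the day with lower total volume, which shows how fluctuations in overall visit volume can shape the corrected signal.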

Usage notes

The data used in our analyses can be found in the data directory, organized by geographic region (four high-latitude states, two low-latitude states, and the whole US). The data are packaged in a hierarchical Python dictionary. Please refer to the project's web page for usage: