Longitudinal trends of EHR concepts in pediatric patients
Data files
Jun 08, 2022 version files 1.51 MB
concept_lm_trends.csv
Abstract
The longitudinal nature of the data motivated us to identify temporal trends across the pediatric EHR datatypes. Over the past three decades (1980-2018), we identified and quantified the temporal trends of 16,460 EHR concepts across the measurement, visit, diagnosis, drug, and procedure datatypes.
Methods
See the Methods of the associated JAMIA Open manuscript.
We defined trends for clinical concepts per EHR datatype per year across participants. We first calculated a standardized z-score, (x - mu)/sd, where x was the percent of a concept (that is, the number of participants with a recorded concept within a year and for a datatype, out of the total number of concepts for that datatype), mu was the average percent across concepts for a year, and sd was the sample standard deviation of the percent across concepts. We then fit a linear model to estimate the association of concept z-scores with time. We quantified the slope and R-squared coefficient between the z-score and year, across all years for which EHR data were provided. This generated a per-year beta coefficient representing the trend in a clinical concept, relative to other EHR concepts, recorded in participants' EHRs. Significant trends were defined as a linear model beta coefficient greater than 0, a beta coefficient confidence interval not containing the null association, and an R-squared coefficient between the year and the z-scores greater than 0.8.
We performed summarization, visualization, and statistical analyses using R packages including tidyverse and data.table and the Python 3 libraries NumPy, Matplotlib, pandas, scikit-learn, and seaborn.
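The z-score and linear-model steps described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `concept_trends`, the array layout (one row per concept, one column per year, for a single datatype), and the toy percentages are all assumptions made for the example. The confidence-interval criterion is left out because it needs a t-distribution quantile that NumPy alone does not provide.

```python
import numpy as np

def concept_trends(percent, years):
    """Hypothetical helper: z-score each year across concepts, then fit
    z ~ year per concept. `percent` has shape (n_concepts, n_years) and
    holds the values for ONE datatype (assumed layout)."""
    percent = np.asarray(percent, dtype=float)
    years = np.asarray(years, dtype=float)

    mu = percent.mean(axis=0)         # mean across concepts, per year
    sd = percent.std(axis=0, ddof=1)  # sample SD across concepts, per year
    z = (percent - mu) / sd

    # Centering the years keeps the least-squares fit well conditioned;
    # it does not change the slope.
    centered = years - years.mean()
    slopes = np.empty(z.shape[0])
    r2 = np.empty(z.shape[0])
    for i, z_row in enumerate(z):
        slope, _intercept = np.polyfit(centered, z_row, 1)
        slopes[i] = slope
        r2[i] = np.corrcoef(centered, z_row)[0, 1] ** 2
    return z, slopes, r2

# Toy input (made-up numbers): three concepts over five years.
years = [2000, 2001, 2002, 2003, 2004]
percent = [
    [1, 2, 3, 4, 5],  # rising relative to the other concepts
    [5, 4, 3, 2, 1],  # falling
    [2, 2, 2, 2, 2],  # flat
]
z, slopes, r2 = concept_trends(percent, years)

# Two of the three criteria from the Methods: positive slope and R^2 > 0.8.
candidate = (slopes > 0) & (r2 > 0.8)
```

Because each year is standardized across concepts, a slope here describes how a concept moves relative to the other concepts in the same datatype, not its absolute frequency.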
Usage notes
Field : Description
datatype : The OMOP-defined EHR domain.
concept_id : The OMOP-defined concept identifier.
concept_name : The OMOP-defined concept name.
lwr : The lower bound of the 95% confidence interval of the odds ratio quantified by the linear model.
odds : The odds ratio of the EHR concept z-score across three decades (year units) quantified by the linear model.
upr : The upper bound of the 95% confidence interval of the odds ratio quantified by the linear model.
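As a rough illustration of working with these fields, the sketch below reads rows with Python's standard `csv` module and converts the three numeric columns to floats. It assumes the header of concept_lm_trends.csv uses exactly the field names listed above, which has not been checked against the file. The helper name `read_trend_rows` and the two toy rows are made up for the example.

```python
import csv
import io

# Field names taken from the usage notes; assumed to match the CSV header.
EXPECTED_FIELDS = ["datatype", "concept_id", "concept_name", "lwr", "odds", "upr"]

def read_trend_rows(handle):
    """Hypothetical helper: parse trend rows from an open text handle."""
    reader = csv.DictReader(handle)
    header = reader.fieldnames or []
    missing = [name for name in EXPECTED_FIELDS if name not in header]
    if missing:
        raise ValueError(f"missing expected columns: {missing}")
    rows = []
    for raw in reader:
        row = dict(raw)
        for key in ("lwr", "odds", "upr"):
            row[key] = float(raw[key])  # numeric columns arrive as strings
        rows.append(row)
    return rows

# Toy stand-in for the real file (values are invented, not from the dataset).
toy_csv = io.StringIO(
    "datatype,concept_id,concept_name,lwr,odds,upr\n"
    "measurement,0,toy concept A,0.10,0.20,0.30\n"
    "drug,1,toy concept B,-0.05,0.01,0.07\n"
)
rows = read_trend_rows(toy_csv)

# Sanity check: the point estimate should sit inside its interval.
ordered = [r["lwr"] <= r["odds"] <= r["upr"] for r in rows]
```

For the real file, the same function could be called as `read_trend_rows(open("concept_lm_trends.csv", newline=""))`, with the path adjusted to wherever the download was saved.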