Data from: Characterizing terminology applied by authors and database producers to informatics literature on consumer engagement with wearable devices
Data files
May 23, 2023 version files 1.19 MB
Abstract
To recommend strategies to improve discoverability of consumer health informatics (CHI) literature, we aimed to characterize controlled vocabulary and author terminology applied to a subset of CHI literature on wearable technologies. A descriptive analysis of articles (N=2,522) from 2019 identified 308 (12.2%) CHI-related articles. Citations with PubMed identifiers are provided for both the included and the excluded studies. The 308 articles were published in 181 journals, which we classified by type of journal (health, informatics, technology, and other), as shown in the third file. We provide an aggregated file of the author-assigned keywords as they appeared in the PubMed records of the included studies, along with our decision about whether each represented consumer engagement. We also include an aggregated file of the Medical Subject Headings assigned to the included studies. The top 100 terms and their frequency scores for the titles and abstracts are also included. We did not include any of the terminology from CINAHL or the engineering databases (Compendex and Inspec together) due to copyright concerns.
Methods
This data set includes 14 files: 7 Microsoft Excel (.xlsx) files and the same 7 files in comma-delimited CSV format.
We searched PubMed on December 19, 2020, using the strategy published in the associated article and limiting to the publication year 2019. The search retrieved 2,522 citations for the feasibility study, which we uploaded to Rayyan.ai for independent double screening by four reviewers with CHI expertise (CAS, KA, CM, SS). The 2,522 abstracts were divided equally across the team of four reviewers so that each abstract was screened independently by two reviewers and each reviewer screened 1,261 abstracts. Discussion and consensus were then used as needed, with a third reviewer making the final decision. The inclusion and exclusion criteria that resulted in this data set appear below and also as a table in the published article.
Final Inclusion and Exclusion Criteria Applied in Screening and Selecting Articles for the Terminology Analysis
| Inclusion | Exclusion |
| --- | --- |
| At least one device in the article meets the following criteria, as identified by the screener but not necessarily stated explicitly by the author(s): | All devices in the article meet at least one of the following criteria, as identified by the screener, but not necessarily stated explicitly by the author(s): |
| The article is not solely a product advertisement or announcement and has an abstract or other substantive content in English. The article itself can be in a language other than English. | Article is solely a product advertisement or announcement. |
| The device is consumer/patient-focused and can be worn and removed. | A device that is implanted or designed as part of a larger clinical medical device or system that is not typically available to consumers. |
| The wearable device measures a health or physiological characteristic relevant to health or well-being. | A device that does not measure anything relevant to health. |
| The consumer/patient can observe the device’s data. | Monitoring or data that is generated is not available to the consumer/patient. |
Files 1 and 2 contain a subset of the citation data exported from the Rayyan system.
File 3 was created by two authors (KA and RW), who categorized the source journals into 4 groups: health, informatics (based on Wang et al.’s core journal list), technology, and other. The “health” category includes journals covering health topics exclusive of informatics or technology. Journals that did not focus on health, informatics, or technology were categorized as “other”; examples include Systematic Reviews, and Evaluation and Program Planning.
Files 4 and 7 count terms as single words or pre-existing phrases in the author keywords and MeSH. To create them, we removed the spaces within each phrase so that it formed a single entity, then counted the resulting entities with a unique word calculator, PlanetCalc. Unique words count | Online calculators. https://planetcalc.com/3205/
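The counting step can be approximated offline. The following is a minimal sketch in Python, not the PlanetCalc tool itself. The function name `count_terms`, the semicolon separator, and the sample keyword strings are assumptions made for illustration. Whether PlanetCalc treats upper and lower case as the same word is not stated here, so the sketch leaves case unchanged.

```python
from collections import Counter


def count_terms(records, sep=";"):
    """Count terms across records, treating each multi-word phrase as one entity."""
    counts = Counter()
    for record in records:
        for term in record.split(sep):
            term = term.strip()
            if not term:
                continue
            # Remove internal whitespace so a phrase such as
            # "physical activity" is counted as the single token "physicalactivity".
            counts["".join(term.split())] += 1
    return counts


# Hypothetical keyword strings, one per article (not taken from the data set).
sample = [
    "wearable devices; physical activity; mHealth",
    "physical activity; sleep",
]
counts = count_terms(sample)
for term, n in counts.most_common():
    print(term, n)
```

Collapsing each phrase before counting is what keeps a word-level counter from splitting a MeSH heading or keyword phrase into its component words.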
Term frequencies for Files 5 and 6 were generated with the MonkeyLearn automated word cloud generator, which uses artificial intelligence to identify multi-word concepts and remove common stop words. MonkeyLearn. Word Cloud Generator. https://monkeylearn.com/word-cloud
Usage notes
This data set consists of Microsoft Excel (.xlsx) and comma-delimited CSV files for the seven supplementary data files published as PDFs with our article.
Supplemental Data File 1: Citations for 308 included papers from 2019
Supplemental Data File 2: Citations for 2214 excluded papers from 2019
Supplemental Data File 3: Journal category analysis
Supplemental Data File 4: Counts for the 855 unique author keywords
Supplemental Data File 5: Frequency data for the top 100 terms in the word cloud generated from titles
Supplemental Data File 6: Frequency data for the top 100 terms in the word cloud generated from abstracts
Supplemental Data File 7: Counts of unique MeSH terms
Any program that can open comma-delimited files, such as Microsoft Excel or another spreadsheet program, can be used. The secondary copies in .xlsx format require Microsoft Excel.
The first several rows of each data table contain the title, description, and definitions for that table. Please open the file and examine those rows, then either delete them or take them into account when sorting or importing the data into other programs.
There are no missing data in the data tables. In Supplemental Data File 3, titles that were not noted as Informatics in the cited article have “no” in that column. In Supplemental Data File 4, terms that were not classified as consumer engagement have a dot (.) in the consumer engagement field.
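When importing a CSV file programmatically, the descriptive rows at the top and the dot (.) placeholder both need handling. The following is a minimal sketch using Python's standard `csv` module. The helper `read_table`, the column names, and the sample contents (including the value "yes") are hypothetical, so check the actual header row and values in each file before adapting it.

```python
import csv
import io


def read_table(text, header_first_cell):
    """Return data rows as dicts, skipping any descriptive rows above the header row."""
    rows = list(csv.reader(io.StringIO(text)))
    for i, row in enumerate(rows):
        if row and row[0].strip() == header_first_cell:
            header = [cell.strip() for cell in row]
            # Keep only non-empty rows that follow the header.
            return [
                dict(zip(header, r))
                for r in rows[i + 1:]
                if any(cell.strip() for cell in r)
            ]
    raise ValueError(f"header row starting with {header_first_cell!r} not found")


# Hypothetical file contents; the real preamble, column names, and values differ.
sample = """Title and description rows appear here,,
Definitions appear here,,
Keyword,Count,Consumer engagement
selfmanagement,12,yes
accelerometer,9,.
"""
rows = read_table(sample, "Keyword")

# A dot (.) marks terms that were not classified as consumer engagement.
engaged = [r["Keyword"] for r in rows if r["Consumer engagement"].strip() != "."]
print(engaged)
```

For a file on disk, the same function can be given the text returned by `open(path, newline="", encoding="utf-8").read()`, where `path` is the location of the downloaded CSV file.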