Dryad

A comparative evaluation of ChatGPT 3.5 and ChatGPT 4 in responses to selected genetics questions - Full study data

Data files

Jun 04, 2024 version files 122.70 KB

Abstract

Objective:

Our objective is to evaluate how accurately and effectively ChatGPT 4 delivers genetic information, building on previous findings with ChatGPT 3.5. We focus on assessing the utility, limitations, and ethical implications of using ChatGPT in medical settings.

Materials and Methods:

A structured questionnaire, including the Brief User Survey (BUS-15) and custom questions, was developed to assess ChatGPT 4's clinical value. An expert panel of genetic counselors and clinical geneticists independently evaluated ChatGPT 4's responses to these questions. We also compared these evaluations with those for ChatGPT 3.5, using descriptive statistics and R for data analysis.

Results:

ChatGPT 4 demonstrated improvements over ChatGPT 3.5 in context recognition, relevance, and informativeness. However, performance variability and concerns about the naturalness of the output were noted. No significant difference in accuracy was found between ChatGPT 3.5 and ChatGPT 4. Notably, the efficacy of ChatGPT 4 varied significantly across genetic conditions, with specific differences identified between responses related to BRCA1 and HFE.

Discussion and Conclusion:

This study highlights ChatGPT 4's potential in genomics, noting significant advancements over its predecessor. Despite these improvements, challenges remain, including the risk of outdated information and the necessity of ongoing refinement. The variability in performance across different genetic conditions underscores the need for expert oversight and continuous AI training. Although ChatGPT 4 shows promise, these findings emphasize the importance of balancing technological innovation with ethical responsibility in healthcare information delivery.