Research Assistant, Cohen Children's Medical Center
Background: Parents often seek advice on injury prevention strategies as they confront various concerns. Alongside the traditional recourse to pediatricians for such information, the recent popularity of artificial intelligence (AI) chatbots such as Chat Generative Pre-trained Transformer (ChatGPT) has introduced a new digital avenue for guidance. However, there is little to no literature characterizing the parental guidance ChatGPT provides on pediatric injury prevention.
Objective: This study aims to examine potential differences in the accuracy and readability of injury prevention responses provided by pediatricians and ChatGPT.
Design/Methods: Injury prevention questions and pediatrician responses (n = 12) were extracted from the official parenting website of the American Academy of Pediatrics (AAP). Questions were entered into ChatGPT version 3.5 with the prompt “How would you respond to a parent asking the following question: [insert question]?” To assess accuracy, two board-certified pediatricians rated the medical correctness and completeness of randomized ChatGPT and pediatrician answers on a 5-point Likert scale. Readability was evaluated using average word count, Flesch Reading Ease (FRE) score, and Flesch-Kincaid Grade Level (FKGL) (Table 1). Raters were also asked which response they preferred.
Results: A total of 24 responses (12 from pediatricians and 12 from ChatGPT) were included and analyzed. Pediatrician responses (M = 4.92, SD = 0.29) had significantly higher completeness scores than ChatGPT responses (M = 4.08, SD = 0.79), but there were no significant differences in medical correctness. Compared to pediatrician responses, ChatGPT-generated responses had a significantly lower FRE score, a higher FKGL, and a lower word count (p < 0.001).
Conclusion(s): While pediatrician and ChatGPT-generated responses were comparably medically correct, the former covered the relevant information more comprehensively. Although ChatGPT responses were shorter overall, the differences in readability metrics indicate that they were less readable and tended to use more complex language. Parents seeking advice on injury prevention may encounter more easily digestible information on HealthyChildren.org, but the AAP should consider the use of medically trained AI chatbots to make it easier for parents to find information. As ChatGPT becomes more prevalent, further research is warranted to explore strategies for enhancing the readability of its content so that more caregivers can access accurate information in a critical domain of child safety and care.
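For readers unfamiliar with the readability metrics named in Design/Methods, the following is a minimal Python sketch of how word count, FRE, and FKGL can be computed from a response. It uses the standard published Flesch formulas, but the syllable counter is a rough vowel-group heuristic and the function names (`count_syllables`, `readability`) are hypothetical; the abstract does not state which tool the authors used, so scores from this sketch may differ from theirs.

```python
import re


def count_syllables(word: str) -> int:
    """Approximate syllables as vowel groups (heuristic, not a dictionary lookup)."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    # Treat a trailing 'e' as silent, except in '-le' endings such as 'table'.
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def readability(text: str) -> dict:
    """Return word count, Flesch Reading Ease, and Flesch-Kincaid Grade Level."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    words_per_sentence = len(words) / len(sentences)
    syllables_per_word = syllables / len(words)
    return {
        "word_count": len(words),
        # Higher FRE means easier to read.
        "fre": 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word,
        # FKGL approximates the U.S. school grade needed to understand the text.
        "fkgl": 0.39 * words_per_sentence + 11.8 * syllables_per_word - 15.59,
    }


if __name__ == "__main__":
    # Illustrative sentence only; not taken from the study data.
    print(readability("Keep small objects out of reach. Watch your child near water."))
```

Because longer sentences and longer words both lower FRE and raise FKGL, a response can be shorter overall yet still score as harder to read, which is the pattern reported for ChatGPT here.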