Trauma and Stressor Related Disorders and Disasters
Tell Me About Combat-Related PTSD! Comparing ChatGPT Output About Combat-Related PTSD to the ABCT Fact Sheet on Military PTSD
Sean A. Lauderdale, Ph.D.
Assistant Professor
University of Houston – Clear Lake
Richmond, Texas
Ray Daniel, M.S.
President
Non-profit
Corpus Christi, Texas
With advances in internet search capabilities, more people will rely on the internet to find mental health information. Recently, OpenAI released one of these advances, ChatGPT. Considered an artificial intelligence language model chatbot, ChatGPT delivers internet search results in a narrative mimicking human speech (Ramponi, 2022). Although ChatGPT was trained by human AI responders, it can generate inaccurate and biased results (Ramponi, 2022). Users unaware of this may find the narrative responses unduly persuasive. Another concern is that most internet searchers don’t practice digital media literacy (Stvilia et al., 2009) and trust health information provided by the internet (Pew Internet & American Life Project, 2006) and chatbots (Abd-Alrazaq et al., 2021). Many veterans also rely on internet searches for health and mental health conditions (McInnes et al., 2010) and may be at risk of encountering believable but inaccurate information generated by AI.
For this investigation, we compared ChatGPT’s output about combat-related posttraumatic stress disorder (CR-PTSD) to an expert-written, peer-reviewed, online fact sheet published by the Association of Behavioral and Cognitive Therapies (ABCT), using text analysis tools to assess narrative content and characteristics affecting readability and accuracy. We specifically compared the total ChatGPT output to the total text of the ABCT fact sheet. We also compared individual sections of the fact sheet (e.g., “What is trauma?”) to queries answered by ChatGPT. Content will also be evaluated by raters knowledgeable about CR-PTSD and masked to the source using DISCERN (Charnock et al., 1999; Grohol et al., 2013), a validated measure for rating the quality of internet mental health information.
Specific statistics included cosine similarity, a measure of document similarity based on the cosine of the angle between documents’ word-frequency vectors. Cosine similarity ranges from 0 to 1, with higher values indicating greater similarity. The non-parametric Mann-Whitney U test was used to compare median differences in vocabulary density (the ratio of unique vocabulary words to total words), readability index (Coleman-Liau; the grade level required to comprehend a text), number of positive words, and number of negative words.
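The two text metrics defined above can be sketched in plain Python. This is an illustrative implementation only, not the analysis tool used in the study; the function names are our own:

```python
from collections import Counter
import math

def cosine_similarity(text_a: str, text_b: str) -> float:
    """Cosine of the angle between two documents' word-frequency vectors.

    Ranges from 0 (no shared vocabulary) to 1 (identical word distributions).
    """
    freq_a = Counter(text_a.lower().split())
    freq_b = Counter(text_b.lower().split())
    # Dot product over the shared vocabulary only (other terms are zero).
    dot = sum(freq_a[w] * freq_b[w] for w in set(freq_a) & set(freq_b))
    norm_a = math.sqrt(sum(c * c for c in freq_a.values()))
    norm_b = math.sqrt(sum(c * c for c in freq_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def vocabulary_density(text: str) -> float:
    """Ratio of unique words to total words (type-token ratio)."""
    words = text.lower().split()
    return len(set(words)) / len(words) if words else 0.0
```

For example, two documents with identical word distributions yield a cosine similarity of 1.0, while documents sharing no vocabulary yield 0.0; the 0.66 reported below falls between these extremes.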
The cosine similarity between the full ABCT fact sheet (3,633 words) and the full ChatGPT output (2,309 words) indicated moderate similarity (0.66). There were no differences in vocabulary density (Z = 1.35, p > .05), number of positive words (Z = 0.69, p > .05), or number of negative words (Z = 0.53, p > .05). The ABCT fact sheet had a lower reading level (mean = 10.82 years of education) than ChatGPT’s output (mean = 14.84 years of education; Z = 3.18, p < .01), indicating that the ABCT fact sheet would be more accessible to veterans.
Preliminary findings suggest that ChatGPT’s output about CR-PTSD is similar to the ABCT fact sheet; however, there were some differences. The ABCT fact sheet focused more on military- and veteran-related themes (e.g., security clearance and moral injury) than ChatGPT did. In contrast, ChatGPT mentioned more treatment options (e.g., medications) but promoted avoidance. Ongoing analysis may reveal other differences in accuracy, relevance, and treatment coverage.