This study evaluated and compared five large language models (LLMs), namely GPT-4, GPT-4o, GPT-4o-mini, Gemini 1.5 Pro, and Claude 3.5 Sonnet, on classifying cognitive triad parameters (self-negative, world-negative, and future-negative) in patient-chatbot interactions. Using Beck’s Cognitive Theory as a framework, we tested the models on their ability to categorize patient statements that exhibit negative thought patterns. GPT-4 outperformed the other models across all cognitive triad categories, achieving the highest accuracy, precision, recall, and F1-score. Gemini 1.5 Pro and GPT-4o also performed strongly, though slightly behind GPT-4. Claude 3.5 Sonnet and GPT-4o-mini performed well but showed lower classification results by comparison. These results indicate that recent LLMs, especially GPT-4, show substantial capability in detecting and categorizing the negative thinking patterns linked to depression in dialogue-based contexts. The comparative evaluation also indicates that specialized training methods and carefully crafted prompting strategies, combined with the models’ inherent strengths, contribute meaningfully to performance on specialized applications such as psychological assessment.

Future work could broaden the dataset to cover more diverse patient conversations and analyze in greater depth how well the models recognize subtle cognitive characteristics. Applying advanced training refinement approaches could also yield further improvements in model effectiveness, increasing the models' viability for practical mental health care use.
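For readers who want to reproduce this kind of comparison, the sketch below shows one common way to compute accuracy and per-category precision, recall, and F1-score for the three cognitive triad labels. It is a minimal illustration, not the evaluation code used in this study: the function name `per_class_metrics` and the toy labels in the usage note are hypothetical, and only the three category names are taken from the text.

```python
from collections import Counter

# Category names come from the paper; everything else here is illustrative.
LABELS = ("self-negative", "world-negative", "future-negative")


def per_class_metrics(y_true, y_pred, labels=LABELS):
    """Return (accuracy, report), where report maps each label to its
    precision, recall and F1-score for single-label classification."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1   # correct prediction for this category
        else:
            fp[pred] += 1    # predicted category was wrong
            fn[truth] += 1   # true category was missed

    report = {}
    for label in labels:
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        report[label] = {"precision": precision, "recall": recall, "f1": f1}

    accuracy = sum(tp.values()) / len(y_true) if y_true else 0.0
    return accuracy, report
```

Calling `per_class_metrics(gold_labels, model_labels)` once per model, where both arguments are lists of category strings for the same statements, gives directly comparable scores. Categories that a model never predicts receive a precision of 0.0 rather than raising an error, which is a design choice of this sketch.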