
1 Introduction

Depression represents a widespread mental health disorder impacting countless individuals globally, frequently resulting in severely disruptive consequences for people’s everyday functioning. A fundamental characteristic of depression involves the distorted thinking patterns that sufferers encounter, typically manifested through pessimistic perspectives regarding oneself, one’s environment, and future prospects. According to Beck’s Cognitive Theory [1], these mental frameworks serve as essential indicators for both diagnostic purposes and evaluating the intensity of depressive disorders. Recognizing and treating these thinking distortions constitutes a vital component of successful therapeutic approaches; nevertheless, traditional evaluation methods tend to be labor-intensive and prone to personal bias [3].

The rapid progress in natural language processing and large language models has created fresh opportunities for streamlining and enhancing how mental health professionals identify and manage conditions like depression. Companies such as OpenAI, Google DeepMind, and Anthropic have built language models that show strong ability to interpret and make sense of human language, which positions them well for spotting distorted thinking patterns within patient dialogues [4]. Researchers are now exploring how these models can be applied to conversations between patients and AI-powered chatbots, with the goal of sorting out harmful or negative thought patterns and flagging them for review.

This approach holds real potential as both an early screening tool for depression and a way to keep track of a patient’s mental state over time [6].

This paper examines how several leading LLMs compare on a specific mental health task: classifying statements that reflect the cognitive triad, which comprises negative thinking about oneself, the world, and the future. The models under examination are GPT-4, GPT-4o, and GPT-4o-mini (OpenAI), Gemini (Google DeepMind), and Claude (Anthropic). The framework guiding this evaluation is Beck’s Cognitive Theory, which provides a structured way to understand how depression manifests through thought patterns [8]. By testing how accurately these models can sort patient statements from chatbot conversations into these three categories, the study aims to lay groundwork for building AI tools that could genuinely support mental health care. Ultimately, the findings are expected to add to the broader conversation around whether LLMs are ready to assist therapists and clinicians in both identifying and addressing depression [10].

2 Literature Review

2.1 Cognitive Theory in Healthcare Contexts

Understanding how people think, feel, and act has been central to psychology and medical practice for decades, with cognitive behavioral frameworks serving as essential tools in this work [2]. Mental health professionals rely on these frameworks to grasp how patients make sense of their experiences and respond to different situations, insight that directly shapes diagnosis and treatment approaches [3]. In recent years, researchers and clinicians have begun exploring how these same principles might enhance digital healthcare tools, particularly in the development of conversational AI systems designed to support patients [7].

Beck’s Cognitive Triad [1, 26] offers a framework for examining how individuals develop pessimistic views regarding themselves, their environment, and what lies ahead, a pattern frequently observed when evaluating conditions such as depression [2]. Beyond this foundational concept, other elements from cognitive theory play significant roles in identifying and understanding distorted thinking patterns that can fuel psychological disorders. These include the formation of negative core beliefs about oneself and the tendency to process information in biased or inaccurate ways [2, 3]. The landscape has evolved in recent years, with cognitive behavioral therapy (CBT) principles now being incorporated into digital platforms, including chatbot applications that provide therapeutic support to people experiencing mental health challenges [5].

The growing adoption of chatbots in medical settings highlights the need to detect and address cognitive elements during these digital exchanges, including patterns linked to the cognitive triad, core beliefs about oneself, and distorted ways of interpreting information [9]. When these problematic thinking patterns are recognized within chatbot conversations, healthcare providers gain valuable opportunities to enhance diagnostic accuracy, refine treatment strategies, and offer more effective emotional care for mental health patients [5, 7].

2.2 Language Models in Healthcare

Recent years have witnessed substantial progress in natural language processing (NLP) driven by large language models (LLMs) like OpenAI’s GPT series [11] and Google’s BERT [12]. Built on transformer architecture, these models possess the capability to examine complete text sequences simultaneously, allowing for language analysis at levels of complexity previously unattainable. This technological advancement has opened doors for diverse healthcare uses, from condensing medical literature [13] to aiding physicians in clinical decision-making processes [14].

When applied to chatbot-patient conversations, LLMs offer possibilities for evaluating cognitive elements through their capacity to parse and comprehend the subtleties of human dialogue. Because these models can grasp contextual meaning and recognize linguistic patterns, they show particular promise in spotting distorted thinking, harmful core beliefs, and skewed interpretation of information, all critical factors in psychological evaluation [15]. To illustrate, LLMs may detect recurring themes in patients’ self-descriptions, their perceptions of surrounding circumstances, and their expectations for what’s to come, potentially revealing signs of conditions like depression or anxiety [16].

Nevertheless, rigorous assessment of how well LLMs can accurately identify and categorize these cognitive elements in actual clinical practice remains necessary. Despite the encouraging prospects, several obstacles, including concerns about patient confidentiality, algorithmic bias, and constraints specific to medical contexts, must be resolved before such technologies can be comprehensively adopted in psychological and psychiatric treatment [17].

2.3 Chatbot-Based Cognitive Behavioral Therapy (CBT)

Considerable research attention has been directed toward chatbot systems built to provide cognitive behavioral therapy (CBT)-based support. Platforms like Woebot [18] and Wysa [19] employ conversational AI technology to involve users in exercises aimed at restructuring their thought processes. These applications are constructed to recognize unhelpful thinking habits, including catastrophizing, making sweeping generalizations, and persistent negative rumination, and to respond with strategies grounded in CBT methodology [20]. Through ongoing dialogue, for example, these platforms guide users in pinpointing and questioning their distorted beliefs, reshaping pessimistic thought patterns, and cultivating more adaptive ways of thinking [20].

Research indicates that large language models (LLMs) can be embedded within these platforms to categorize cognitive elements including distorted reasoning, affective reactions, and patterns of behavior [21]. By incorporating LLMs, chatbots gain improved capacity to interpret and address sophisticated linguistic subtleties, strengthening their potential to identify cognitive distortions as conversations unfold. Work by Hollis et al. [22] demonstrated that CBT interventions delivered through chatbots could serve as an accessible and expandable option compared to conventional therapeutic approaches, establishing a basis for examining cognitive theories in digital settings. That said, obstacles persist in enhancing the accuracy and contextual comprehension of these models, especially when categorizing intricate cognitive elements across varied patient groups [23, 25].

Beyond the technical hurdles, ethical questions arise concerning the protection of personal information and secure data handling, potential prejudices embedded in AI algorithms, and difficulties maintaining user participation [24]. Even with these limitations, CBT delivered through chatbot platforms holds considerable promise for bridging the widening divide in mental health service availability by providing interventions that are reachable, affordable, and capable of expansion, especially for communities with limited access to care [22].

2.4 Large Language Models for Cognitive Parameter Classification

Contemporary research has examined how large language models (LLMs) might categorize cognitive elements within chatbot-patient conversations. Feng et al. [15] explored the training of LLMs to detect distorted thinking in patient statements throughout cognitive behavioral therapy (CBT) sessions. Their findings suggested that LLMs demonstrate capability in spotting frequent maladaptive thought patterns, including catastrophizing and sweeping generalizations [15].

However, they also highlighted challenges in distinguishing between subtle variations in these distortions, such as differentiating between rumination and repetitive negative thinking, which are often observed in patients with depression [15].

In the context of mental health, Liu et al. [21] evaluated the use of BERT for extracting psychological features from patient responses in a chatbot-based mental health assessment tool. Their findings suggest that LLMs can accurately identify emotional tone and cognitive biases in text [13]. However, they noted that further fine-tuning is required to ensure these models achieve high accuracy and reliability, particularly in clinical settings where precision is critical [17, 21].

In another study, Zhou et al. [16] demonstrated the potential of GPT-3–based models in identifying cognitive–behavioral symptoms in patients with depression. By fine-tuning the model to classify patient responses into categories such as negative self-schema and catastrophic thinking, they showed that LLMs could effectively support the assessment process [16]. However, they also pointed to limitations related to the models’ ability to generalize across diverse patient populations, particularly when trained on data from homogenous groups [15, 16].

Although LLMs have demonstrated considerable promise for categorizing cognitive elements, numerous obstacles still exist. Problems involving the reliability of training data, algorithmic prejudices, and ethical dilemmas regarding the confidentiality of patient information must be resolved prior to widespread implementation in clinical practice [15, 17, 24]. Moreover, the black-box nature of these models makes them difficult to interpret, posing a significant barrier to their widespread use in clinical practice [18]. Future research should focus on improving model transparency, expanding training datasets to better represent diverse populations, and enhancing the precision of these models for nuanced cognitive assessments [15, 17].

3 Proposed Methodology

A. Objective

The objective of this study is to compare the performance of different large language models (LLMs) in classifying cognitive triad parameters (self-negative, world-negative, and future-negative) in the context of patient-chatbot interactions based on Beck’s Cognitive Theory. Specifically, the study aims to evaluate the ability of LLMs to categorize real-world patient statements as being negative with respect to self, the world and the future.

B. Data Collection

Conversations Dataset: For this study, we will use 20 real-life patient-chatbot conversations (detailed in the previous steps). Each conversation consists of dialogues between a patient and a counselor or chatbot. The conversations contain statements that the models must classify into one of the three categories:
• Self-negative: Statements expressing negative thoughts about oneself (e.g., “I’m not good enough”).
• World-negative: Statements expressing negativity toward the world, others, or external circumstances (e.g., “No one cares about me”).
• Future-negative: Statements expressing negative thoughts about the future (e.g., “Things will never improve”).

The dataset will be split into individual patient statements, and each statement will be manually labeled according to its cognitive triad category.
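The splitting step can be sketched as follows. This is a minimal illustration, not the study's actual preprocessing: it assumes each transcript line starts with a speaker label such as "Patient:" (the colon-prefixed layout, the `LabeledStatement` record, and the `extract_patient_statements` helper are all hypothetical), and it leaves the label field empty for the human annotator to fill in.

```python
from dataclasses import dataclass
from typing import List, Optional

# Assumed speaker prefix; the paper does not specify the transcript layout.
PATIENT_PREFIX = "Patient:"


@dataclass
class LabeledStatement:
    text: str
    # Filled in manually by an annotator: "self-negative",
    # "world-negative", or "future-negative".
    label: Optional[str] = None


def extract_patient_statements(transcript: str) -> List[LabeledStatement]:
    """Return one unlabeled record per non-empty patient line."""
    statements = []
    for line in transcript.splitlines():
        line = line.strip()
        if line.startswith(PATIENT_PREFIX):
            text = line[len(PATIENT_PREFIX):].strip()
            if text:
                statements.append(LabeledStatement(text))
    return statements


demo_transcript = """Counselor: How have you been feeling?
Patient: I'm not good enough.
Counselor: What makes you say that?
Patient: Things will never improve."""

statements = extract_patient_statements(demo_transcript)
# Manual annotation step, shown here with hand-assigned labels.
statements[0].label = "self-negative"
statements[1].label = "future-negative"
```

Keeping the label optional makes it easy to detect statements that have not yet been annotated before any model comparison is run.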

C. Prompt Design and Inference Strategy

To classify patient statements according to Beck’s Cognitive Triad, a structured, instruction-based prompting strategy is employed. Separate prompts are designed for each cognitive category: self-negative, world-negative, and future-negative, to ensure conceptual clarity and reduce inter-class ambiguity.

Each prompt positions the large language model as a psychology expert and provides explicit definitions and decision rules for the target cognitive mechanism. This role-based framing encourages clinically grounded reasoning while maintaining consistency across model evaluations. Table 1 summarizes the prompt design for cognitive triad classification, listing the key identification criteria, exclusion rules, and output format for each category. Table 2 presents sample patient statements and the corresponding cognitive triad classification outcomes.

For each conversation, patient utterances are concatenated and provided as input text. The input is explicitly delimited to avoid prompt leakage and ensure clear contextual boundaries. Speaker labels (e.g., Patient, Counselor) are retained in the input for contextual understanding but are removed from the model output.
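A prompt of this kind might be assembled as in the sketch below. The wording is illustrative and paraphrases the criteria in Table 1; it is not the prompt text used in the study. The `PROMPT_SPECS` dictionary, the `build_prompt` helper, the underscore-style field names, and the `<<<TRANSCRIPT` / `TRANSCRIPT>>>` delimiters are all assumptions made for the example.

```python
from typing import Dict, List

# Illustrative inclusion/exclusion rules paraphrased from Table 1.
PROMPT_SPECS: Dict[str, Dict[str, List[str]]] = {
    "self_negative": {
        "include": [
            "negative beliefs about oneself (inadequacy, failure, low self-worth)",
            "explicit self-referential expressions",
        ],
        "exclude": [
            "future-oriented statements",
            "negativity toward others or external situations",
        ],
    },
    "world_negative": {
        "include": [
            "negative statements about people, relationships, or society",
            "external attribution of blame or hostility",
        ],
        "exclude": ["self-directed negativity", "future-oriented pessimism"],
    },
    "future_negative": {
        "include": [
            "hopeless or pessimistic expectations about the future",
            "future-tense, self-referential statements",
        ],
        "exclude": ["present- and past-tense statements", "negativity about others"],
    },
}

# Hypothetical delimiters marking where the transcript begins and ends.
START, END = "<<<TRANSCRIPT", "TRANSCRIPT>>>"


def build_prompt(category: str, transcript: str) -> str:
    """Compose a role-based, delimited prompt for one triad category."""
    spec = PROMPT_SPECS[category]
    lines = [
        "You are a psychology expert analyzing a patient conversation.",
        f"Extract every patient statement that matches the category '{category}'.",
        "Include:",
        *[f"- {rule}" for rule in spec["include"]],
        "Exclude:",
        *[f"- {rule}" for rule in spec["exclude"]],
        f'Return only a JSON object of the form {{"{category}": ["statement", ...]}}.',
        "Do not include speaker labels in the output.",
        START,
        transcript.strip(),
        END,
    ]
    return "\n".join(lines)


example_prompt = build_prompt("future_negative", "Patient: Things will never improve.")
```

Placing the transcript last, between fixed delimiters, keeps the instructions separate from patient text so that nothing in the conversation is read as part of the task description.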

Table 1: Prompt Design for cognitive triad classification

Cognitive Category	Prompt Role Definition	Key Identification Criteria	Exclusion Rules	Output Format
Self-negative	Model is instructed to act as a psychology expert analyzing patient text for self-directed negativity.	Negative beliefs about oneself (e.g., inadequacy, failure, low self-worth); explicit self-referential expressions	Exclude future-oriented statements (e.g., “I will fail”); exclude negativity toward others or external situations	JSON object with field self_negative containing a list of extracted statements.
World-negative	Model is instructed to identify negative perceptions of the external world and others.	Negative statements about people, relationships, or society (e.g., friends, family, colleagues); external attribution of blame or hostility	Exclude self-directed negativity; exclude future-oriented pessimism	JSON object with field world_negative containing a list of extracted statements.
Future-negative	Model is instructed to analyze pessimistic beliefs about future outcomes.	Hopeless or pessimistic expectations about the future; future-tense, self-referential statements	Exclude present- and past-tense statements; exclude negativity about others	JSON object with field future_negative containing a list of extracted statements.

Table 2: Sample Patient Statements and Corresponding Cognitive Triad Classification Outcomes

Example ID	Patient Statement	Predicted Category	Rationale (Brief)
E1	I feel like I am not good enough no matter how hard I try.	Self-negative	Explicit negative self-evaluation and low self-worth
E2	Everyone around me ignores my problems.	World-negative	Negative attribution toward others and social environment
E3	Nothing is ever going to get better for me.	Future-negative	Pessimistic expectation about future outcomes
E4	I always mess things up and disappoint myself.	Self-negative	Self-directed blame and personal inadequacy
E5	People only care about themselves and not about me.	World-negative	Negative perception of others’ intentions
E6	I don’t see any hope for my life in the coming years.	Future-negative	Future-oriented hopelessness
E7	My colleagues never support me when I need help.	World-negative	External attribution of blame
E8	I am a failure and I can’t do anything right.	Self-negative	Strong negative self-concept
E9	Things will never change, no matter what I do.	Future-negative	Absolute pessimism about the future

D. Model Selection

Large Language Models (LLMs): The models that will be evaluated in this study include:

• GPT-4 (OpenAI)

• GPT-4o (OpenAI)

• GPT-4o-mini (OpenAI)

• Gemini (Google DeepMind)

• Claude (Anthropic)

These models will be evaluated on their ability to classify each patient statement into one of the cognitive triad categories (self-negative, world-negative, and future-negative).

E. Architecture

The architecture (Fig. 1) for classifying cognitive triad parameters in patient-chatbot interactions takes the patient’s input, typically a text file containing conversational data. This input is preprocessed by normalizing and tokenizing the text and is then sent through structured prompts designed to classify the statements into three categories: self-negative, world-negative, and future-negative. Each category is handled by a dedicated function (identify_self_negative(), identify_world_negative(), identify_future_negative()) that interacts with the LLM via its API in a zero-shot setting. The model returns classified statements in JSON format, which are then used for further evaluation, helping analyze the patient’s cognitive triad based on Beck’s Cognitive Theory.

Figure 1: Pipeline Architecture for Prompt-Based Cognitive Triad Classification

The proposed pipeline implements a sequential prompt-based inference workflow for identifying cognitive triad categories from patient textual data. The architecture decomposes the classification task into three independent prompt-driven stages, each dedicated to one cognitive dimension: self-negative, world-negative and future-negative. This modular design enables fine-grained control over classification logic and improves interpretability by isolating each cognitive mechanism.
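The modular structure described here could look roughly like the sketch below. It is an assumption-laden illustration rather than the study's code: the model call is abstracted as a plain callable (`LLMCall`) so the example runs without any vendor API, the prompt is a placeholder, and `fake_llm` returns canned answers purely for demonstration. The three function names follow the ones mentioned in the architecture description, written with underscores.

```python
import json
from typing import Callable, Dict, List

# Hypothetical interface: prompt text in, raw model response text out.
LLMCall = Callable[[str], str]


def _extract(category: str, text: str, call_llm: LLMCall) -> List[str]:
    """Run one zero-shot prompt and parse the JSON list for `category`."""
    prompt = f"Category: {category}\nText:\n{text}"  # placeholder prompt
    raw = call_llm(prompt)
    try:
        payload = json.loads(raw)
    except json.JSONDecodeError:
        return []  # malformed output is treated as "nothing extracted"
    if not isinstance(payload, dict):
        return []
    items = payload.get(category, [])
    if not isinstance(items, list):
        return []
    return [s for s in items if isinstance(s, str)]


def identify_self_negative(text: str, call_llm: LLMCall) -> List[str]:
    return _extract("self_negative", text, call_llm)


def identify_world_negative(text: str, call_llm: LLMCall) -> List[str]:
    return _extract("world_negative", text, call_llm)


def identify_future_negative(text: str, call_llm: LLMCall) -> List[str]:
    return _extract("future_negative", text, call_llm)


def classify_conversation(text: str, call_llm: LLMCall) -> Dict[str, List[str]]:
    """Run the three independent stages on the same input text."""
    return {
        "self_negative": identify_self_negative(text, call_llm),
        "world_negative": identify_world_negative(text, call_llm),
        "future_negative": identify_future_negative(text, call_llm),
    }


def fake_llm(prompt: str) -> str:
    """Stand-in for a real model: returns canned JSON per category."""
    canned = {
        "self_negative": ["I am a failure."],
        "world_negative": [],
        "future_negative": ["Things will never change."],
    }
    category = prompt.splitlines()[0].split(": ", 1)[1]
    return json.dumps({category: canned[category]})


result = classify_conversation("Patient: I am a failure.", fake_llm)
```

Passing the model call in as an argument means the same three functions can be pointed at different LLMs for comparison, and a response that is not valid JSON degrades to an empty list instead of stopping the run.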

The pipeline stages are as follows:

1. Input Acquisition

The system accepts a text file containing patient–chatbot conversation transcripts.
The entire file is read and treated as a single input document.
No pre-filtering or linguistic normalization is applied at this stage to preserve original linguistic cues.

2. Self-Negative Identification Module

The input text is passed to a self-negative prompt module.
A category-specific prompt instructs the LLM to extract statements expressing negative self-perception.
The model returns a JSON-formatted list of self-negative statements.

3. World-Negative Identification Module

The same input text is independently processed by the world-negative prompt module.
This module focuses on statements reflecting negative perceptions of others or the external world.
Outputs are again constrained to a JSON list format.

4. Future-Negative Identification Module

The input text is forwarded to the future-negative prompt module.
The model extracts statements expressing pessimism or hopelessness about future outcomes.
Only future-oriented self-referential statements are retained.

5. Output Aggregation and Logging

Outputs from the three modules are aggregated into separate lists:

Self-negative statements
World-negative statements
Future-negative statements

The results are logged for downstream evaluation and comparison with manually annotated ground truth labels.
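The comparison with ground truth can be sketched as a per-category set comparison. This is one possible scoring scheme, stated as an assumption: a predicted statement counts as correct only if it matches an annotated statement exactly after lowercasing and whitespace collapsing. The helper names (`normalize`, `score_category`, `score_run`) and the example data are hypothetical.

```python
from typing import Dict, Iterable, List

CATEGORIES = ("self_negative", "world_negative", "future_negative")


def normalize(statement: str) -> str:
    """Lowercase and collapse whitespace so trivial differences are ignored."""
    return " ".join(statement.lower().split())


def score_category(predicted: Iterable[str], gold: Iterable[str]) -> Dict[str, float]:
    """Count true/false positives and false negatives for one category."""
    pred = {normalize(s) for s in predicted}
    ref = {normalize(s) for s in gold}
    tp = len(pred & ref)
    fp = len(pred - ref)
    fn = len(ref - pred)
    precision = tp / (tp + fp) if pred else 0.0
    recall = tp / (tp + fn) if ref else 0.0
    return {"tp": tp, "fp": fp, "fn": fn, "precision": precision, "recall": recall}


def score_run(
    predictions: Dict[str, List[str]], gold: Dict[str, List[str]]
) -> Dict[str, Dict[str, float]]:
    """Score every triad category for one model on one conversation."""
    return {
        c: score_category(predictions.get(c, []), gold.get(c, []))
        for c in CATEGORIES
    }


# Illustrative data only; not taken from the study's dataset.
predictions = {
    "self_negative": ["I am a failure.", "My colleagues never support me."],
    "world_negative": ["Everyone around me ignores my problems."],
    "future_negative": [],
}
gold = {
    "self_negative": ["I am a failure."],
    "world_negative": [
        "Everyone around me ignores my problems.",
        "My colleagues never support me.",
    ],
    "future_negative": ["Things will never change."],
}
report = score_run(predictions, gold)
```

In this toy case the statement about colleagues is extracted under the wrong category, so it appears as a false positive for self-negative and a false negative for world-negative, which is the kind of inter-class confusion the exclusion rules are meant to reduce.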

S.No	Model	Accuracy (%)	Precision	Recall	F1-Score
1	GPT-4	92.1	0.923	0.917	0.920
2	GPT-4o	93.4	0.936	0.929	0.932
3	GPT-4o-mini	88.7	0.892	0.881	0.886
4	Gemini 1.5 Pro	90.9	0.911	0.904	0.907
5	Claude 3.5 Sonnet	91.6	0.918	0.912	0.915
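As a quick consistency check on the table above, the F1-Score column can be recomputed from the Precision and Recall columns using the standard harmonic-mean definition, F1 = 2PR / (P + R). The sketch below copies the reported values from the table; the variable and function names are illustrative.

```python
# (precision, recall, reported F1) copied from the results table.
reported = {
    "GPT-4": (0.923, 0.917, 0.920),
    "GPT-4o": (0.936, 0.929, 0.932),
    "GPT-4o-mini": (0.892, 0.881, 0.886),
    "Gemini 1.5 Pro": (0.911, 0.904, 0.907),
    "Claude 3.5 Sonnet": (0.918, 0.912, 0.915),
}


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


recomputed = {
    model: round(f1_score(p, r), 3) for model, (p, r, _) in reported.items()
}

for model, (_, _, f1) in reported.items():
    print(f"{model}: reported {f1:.3f}, recomputed {recomputed[model]:.3f}")
```

For every model the recomputed value agrees with the reported F1-Score to three decimal places.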

S.No	Model	Self-Negative	World-Negative	Future-Negative	Macro F1
1	GPT-4	0.94	0.90	0.92	0.920
2	GPT-4o	0.95	0.91	0.94	0.932
3	GPT-4o-mini	0.90	0.86	0.90	0.886
4	Gemini 1.5 Pro	0.92	0.89	0.91	0.907
5	Claude 3.5 Sonnet	0.93	0.90	0.92	0.915
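Assuming the three per-category columns above are per-class F1 scores (the table does not state this explicitly), the Macro F1 column should be their unweighted mean. The sketch below recomputes that mean from the tabulated values; names are illustrative.

```python
# (self, world, future) per-category scores and reported Macro F1,
# copied from the table above.
rows = {
    "GPT-4": ((0.94, 0.90, 0.92), 0.920),
    "GPT-4o": ((0.95, 0.91, 0.94), 0.932),
    "GPT-4o-mini": ((0.90, 0.86, 0.90), 0.886),
    "Gemini 1.5 Pro": ((0.92, 0.89, 0.91), 0.907),
    "Claude 3.5 Sonnet": ((0.93, 0.90, 0.92), 0.915),
}

# Macro average: every category counts equally, regardless of its size.
macro = {model: sum(scores) / len(scores) for model, (scores, _) in rows.items()}

# Absolute difference between the recomputed mean and the reported value.
gap = {model: abs(macro[model] - rows[model][1]) for model in rows}

for model in rows:
    print(f"{model}: mean {macro[model]:.3f}, reported {rows[model][1]:.3f}")
```

The recomputed means fall within 0.002 of the reported Macro F1 for every model, which is consistent with the per-category scores having been rounded to two decimals before tabulation.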