Case Studies

A closer peek into the heuristic analysis we conducted on selected mental health chatbots to validate our guidelines

Selection Criteria

We selected 12 popular mental healthcare mobile apps from the Play Store, focusing on those offering therapy chatbot services. Of these, 5 were shortlisted for our study.

Apps such as Anima and Antar, while popular, were excluded because they primarily focus on companionship and journaling features rather than therapeutic resources.

Similarly, apps like Woebot and Ginger by Headspace were excluded due to limited access to their free services. The final list of shortlisted apps is as follows:

Wysa: Mental Health Support
Youper: AI For Mental Health Care
VOS.Health: Mental Health App
Audyn: AI Mental Health
Sintelly: CBT Therapy Chatbot

Comparison Matrix

View the spreadsheet summarizing the results of our heuristic audit

Evaluation Findings

Clarity of Capabilities

Wysa and Youper offer some explanation of how they use CBT/DBT, but overall the apps provide little information about their reasoning processes, limitations, and capabilities.

Reliability and Accuracy 

Most apps offer no insight into the reliability or accuracy of the bot's functionality. Audyn addresses the potential for misunderstood responses and clarifies its role as primarily a text generator.

Contextual Relevance

The bots retain limited context across recent messages. Sintelly completely skips over the user’s response and reacts insensitively and irresponsibly, replying “congratulations” to someone experiencing depression.

Bias Mitigation

Responses are mostly indifferent or ‘neutral’, reinforcing biases; in one instance the bot refused to engage altogether.

Scope of Services

Wysa, Youper, and Sintelly recommend external resources for emergencies outside their purview, but Audyn and VOS do not. The recommended resources need more vetting and supporting information.

Learning and Adaptation

Limited information was available to evaluate this heuristic.

Match Relevant Social Norms

Audyn and Wysa display limited emotionality (e.g. happy or sad faces). Wysa uses visual aids, such as illustrated breathing exercises, to explain phenomena.

However, Audyn uses too many emojis, which could make users feel they are not being taken seriously.

Feedback and Consequences

Absence of feedback mechanisms, except for customer ratings.

Global Controls

Lack of user control and data transparency, which can endanger the safety of vulnerable individuals (e.g. people of color, or people with autism or schizophrenia).
