MIT researchers teach AI to spot depression

Researchers used machine learning to build a neural network to recognise the signs of depression in speech and text

Robot psychologist

Researchers at MIT have created a neural network that can be used to spot the signs of depression in human speech.

In a paper being presented at the Interspeech Conference, the researchers detail a neural-network model that can be unleashed on raw text and audio data from interviews to discover speech patterns indicative of depression.


"The first hints we have that a person is happy, excited, sad, or has some serious cognitive condition, such as depression, is through their speech," says first author Tuka Alhanai, a researcher in the Computer Science and Artificial Intelligence Laboratory.

The researchers say that, given a new subject, the model can accurately predict whether the individual is depressed without needing any other information about the questions and answers.

"If you want to deploy depression-detection models in a scalable way, you want to minimize the number of constraints you have on the data you're using. You want to deploy it in any regular conversation and have the model pick up, from the natural interaction, the state of the individual," said Alhanai. 

It is hoped the method could be developed into a tool that detects signs of depression in natural conversation, such as a mobile app that monitors a user's text and voice for mental distress and sends alerts.

The researchers' model was trained and tested on a dataset of 142 interactions drawn from audio, text, and video interviews between patients with mental-health issues and virtual agents controlled by humans.


Each subject was scored for depression on a scale from 0 to 27 using a personal health questionnaire. Scores above a cutoff between the moderate band (10 to 14) and the moderately severe band (15 to 19) were considered depressed, while all scores below that threshold were considered not depressed. Of all the subjects in the dataset, 20% were labelled as depressed.

A key insight from the experiments was that the model needed much more data to predict depression from audio than from text. With text, the model accurately detected depression using an average of seven question-answer sequences, whereas with audio it needed around 30.

"That implies that the patterns in words people use that are predictive of depression happen in shorter time span in text than in audio," Alhanai added.
