- Obstacles to diagnosis prevent many depressed people from seeking treatment
- Neural network detects signs of depression from natural conversations
- Model could someday power apps that monitor mood and alert users
Some medical conditions are easy to diagnose. Measles gives you a rash. Chest pain might indicate a heart attack. But the symptoms of depression can be harder to pinpoint.
Clinicians often screen for depression with a brief questionnaire that asks patients about their level of interest in regular activities, their appetite and eating habits, their ability to concentrate, and other indicators of the condition. The questions are based on criteria from the Diagnostic and Statistical Manual of Mental Disorders, Fourth Edition (DSM-IV).
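Questionnaires of this kind are typically scored by summing per-item ratings and comparing the total to a cutoff. The sketch below is purely illustrative: the item wording, the 0–3 rating scale, and the cutoff are hypothetical placeholders, not the actual instrument.

```python
# Hypothetical scoring for a brief depression-screening questionnaire.
# Items, rating scale, and cutoff are illustrative, not a real instrument.

ITEMS = [
    "little interest or pleasure in activities",
    "poor appetite or overeating",
    "trouble concentrating",
]

def screen(responses, cutoff=2):
    """Sum per-item frequency ratings (0 = not at all ... 3 = nearly every day)
    and flag the respondent for follow-up if the total meets the cutoff."""
    total = sum(responses[item] for item in ITEMS)
    return total, total >= cutoff
```

A flagged result would prompt further evaluation by a clinician, not a diagnosis on its own.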
Once diagnosed, depression can be treated. But, according to Tuka Al Hanai, obstacles such as cost, mobility, and motivation may prevent depressed people from seeking the help they need.
Al Hanai is a researcher in the Computer Science and Artificial Intelligence Laboratory (CSAIL) at the Massachusetts Institute of Technology (MIT). She and her colleagues wanted to know if an algorithm could be developed to evaluate a person’s mental health through speech and language cues. They hope that the availability of an automated screening process might reduce barriers to diagnosis and treatment.
In a recent paper, Al Hanai and her team describe a study aimed at detecting depression using a Long Short-Term Memory (LSTM) neural network to model audio and text transcriptions extracted from the Distress Analysis Interview Corpus (DAIC).
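An LSTM reads a sequence of feature vectors one step at a time, carrying a memory cell forward so that earlier parts of an interview can influence the final prediction. The paper's architecture is more elaborate, but the core recurrence can be sketched in plain NumPy; all dimensions, parameter shapes, and the output head below are illustrative, and the weights are untrained.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_forward(xs, W, U, b, h0, c0):
    """Run a single-layer LSTM over a sequence of feature vectors xs.
    W, U, b hold the stacked input/forget/cell/output gate parameters."""
    h, c = h0, c0
    H = h0.shape[0]
    for x in xs:
        z = W @ x + U @ h + b       # stacked pre-activations, shape (4H,)
        i = sigmoid(z[0:H])         # input gate
        f = sigmoid(z[H:2*H])       # forget gate
        g = np.tanh(z[2*H:3*H])     # candidate cell state
        o = sigmoid(z[3*H:4*H])     # output gate
        c = f * c + i * g           # memory cell carries long-range context
        h = o * np.tanh(c)
    return h  # final hidden state summarizes the whole sequence

def predict_depression(xs, params):
    """Score a subject's interview from the final LSTM state (untrained demo)."""
    W, U, b, w_out, b_out = params
    H = b.shape[0] // 4
    h = lstm_forward(xs, W, U, b, np.zeros(H), np.zeros(H))
    return sigmoid(w_out @ h + b_out)  # probability-like score in (0, 1)
```

In practice, `xs` would be acoustic features or text embeddings extracted from each interview turn, and the parameters would be learned from labeled corpus data rather than set by hand.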
The DAIC consists of interactions between human subjects and a virtual agent. In the corpus, 142 individuals were screened for depression through interviews with a human-controlled virtual agent. The agent asked questions such as ‘How are you?’ and ‘Do you consider yourself to be an introvert?’, and it gave feedback to the subject using natural responses like ‘I see’ and ‘Tell me more about that.’
In previous attempts to automate the depression screening process, two methods have been pursued. The first uses specific questions, such as ‘Do you have a history of depression?’, to model a subject’s outcome. The second models outcomes based on features of the responses unrelated to the questions asked (e.g., speaking rate).
Because depression screening is a question-answer process, the CSAIL researchers wanted to model depression using sequences of responses rather than questions contrived to yield a particular response. Al Hanai says that an automated depression-detection model can be deployed in a scalable way if it can take cues from natural human interactions.
The group ran three experiments. The first used questions that were not necessarily meant to elicit a response about the subject’s mood. These ‘context-free’ queries allowed the neural network to learn from the data itself. That is, it needed no pre-conditioned knowledge.
In the second experiment, the neural network assessed the subjects’ responses to pre-conditioned questions. The researchers used this weighted modeling phase to determine the predictive power of the questions. The third experiment used the sequence of responses independent of question context to predict depression. This method is known as sequence modeling.
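One way to picture the difference between the three setups is in what each feeds to the model. The sketch below is a hypothetical illustration: the field names, the question list, and the data layout are placeholders, not the study's actual pipeline.

```python
# Illustrative input selection for the three experimental setups.
# Field names and the question list are hypothetical placeholders.

DEPRESSION_QUESTIONS = {"Do you have a history of depression?"}

def context_free(interview):
    """Experiment 1: each response scored on its own; question text ignored,
    so every response becomes a one-element 'sequence'."""
    return [[turn["response_features"]] for turn in interview]

def question_conditioned(interview):
    """Experiment 2: only responses to questions weighted as predictive
    of depression are kept."""
    return [turn["response_features"] for turn in interview
            if turn["question"] in DEPRESSION_QUESTIONS]

def sequence_model_input(interview):
    """Experiment 3: the full ordered sequence of responses, with the
    question context dropped, fed to the model as one sequence."""
    return [turn["response_features"] for turn in interview]
```

The key contrast is that the sequence setup preserves the order of responses across the whole interview, letting the model pick up patterns that span multiple turns.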
The researchers found that when using context-free modeling, the neural network was better at detecting depression from audio features than it was from text features. When the questions were pre-conditioned, audio continued to perform better than text.
The sequence modeling experiment yielded the best results. The model learns a subject’s speaking style and word sequences and finds patterns indicative of depression. From there, it can detect depression in new subjects who exhibit the same patterns.
When asked if this technology is meant to replace humans, Al Hanai says not to worry. “It is sometimes difficult to understand that such AI systems will probably be used as assistive technologies for medical professionals, and will not very likely replace them.” She says that humans are needed to train and assess the accuracy of the systems.
For example, it is uncertain how well the model might perform across different languages and cultures. On the other hand, it might be personalized to an individual’s baseline speech patterns and be able to detect when a depressive episode is beginning.
The researchers hope that in the future, their model could power mobile apps that monitor a user’s text and voice for mental distress and send alerts. This could encourage an otherwise reluctant person to seek professional help. Al Hanai is also interested in using the findings from this research to develop models to detect dementia from speech and language.