• The research aims to facilitate diagnosis of the disease for both doctors and patients
  • It was the focus of the PhD thesis of researcher Paula López, of atlanTTic's Multimedia Technologies research group

Developing a system that can detect from a person’s voice whether or not they suffer from depression: this is one of the main goals of the Multimedia Technologies Group (MTG) at the University of Vigo, where researcher Paula López devoted a substantial part of her doctoral thesis to it. Her work is a thorough analysis of how technology can, through a simple analysis of the voice, provide information about a person’s emotional state.

“This system would be very useful for raising awareness of this increasingly common disease, and it could also facilitate the diagnostic process for both the doctor and the patient,” says López, a postdoctoral researcher in the MTG. The team specializes in speech technology, image processing and machine vision, is led by Professor Carmen Garcia Mateo (thesis co-director with Laura Docio), and has been recognized since 2007 as a Competitive Reference Group by the Galician government.

“Obtaining information about emotional state through voice technologies is a research area that is receiving increasing attention,” remarks López, who explains that this is due to the many practical applications of this type of work: market analysis, game design, advances in human-machine communication interfaces and, “though less addressed”, the study of diseases and mental disorders.

Also applicable to Parkinson’s and Alzheimer’s diseases

The level of depression in a person is studied using specific databases, recorded in hospitals with real patients who are asked to perform tasks such as reading a paragraph from a book or recalling moments of joy or sadness from their childhood. “Then, they fill in a form on their own to assess the level of their depressive state and, by repeating this process periodically, we can observe how the patient is responding to a particular treatment,” says the researcher. She stresses that the main problem is the sensitivity of the information contained in these databases, which is why it is difficult to find public material to work with. “Therefore, it is still difficult to know which characteristics of the voice are the most important,” says López, who notes that the Galician research group should ideally compile its own recordings with a group of patients, “with medical support and/or the opinion of neurologists who are experts in the disease”, although this is not always easy because both doctors and patients can be reluctant. “In fact, the best option is usually to work with patients’ associations, because people who come to these centers are usually open to this kind of activity,” emphasizes the researcher.

Given the importance of the issue, the Multimedia Technologies Group is extending this research to the automatic detection of diseases such as Parkinson’s and Alzheimer’s. Because Parkinson’s causes slurred speech even in its early stages, there is already strong evidence as to which voice and speech characteristics are the most relevant, “because in this case the prosody of the speaker is clearly impaired by the illness”. Here the purpose is not only to aid diagnosis but also to improve the monitoring of the disease’s progression, “in order to verify whether treatments and therapies are working”, points out the author of the thesis.

Automatic speaker detection methods

In parallel with the work on detecting emotional state, Paula López devoted the other part of her doctoral thesis to a detailed analysis of the best techniques for detecting, fully automatically, who is speaking at each moment in an audio recording, a task known as speaker segmentation. “Every day there is more multimedia content that must be labeled so it can be accessed, as this is the only way to locate it,” explains the author of the thesis. She stresses that much remains to be done to fully exploit the potential of this technology, since existing methods are not yet as accurate as they should be, although according to the researcher the number of hours saved by using these automatic techniques is “still” immense.

Being able to automatically detect the identity of the speakers in a recording, or what they say, would make it possible for Internet search engines to support audio-based search. “Thus we could index multimedia content in which somebody talks about a certain topic or in which a speaker who interests us appears,” remarks López, whose studies have already been published in international journals.



Source (Spanish): Duvi