Parkinson's disease: detecting changes in speech patterns

Japanese researchers have found that natural language processing might be an effective tool to analyze specific speech changes of patients with Parkinson's disease (PD), allowing for more effective diagnosis.

The research group headed by Prof. Masahisa Katsuno, Department of Neurology, Nagoya University Graduate School of Medicine, Principal Investigator, and Visiting Researcher Katsunori Yokoi, Department of Neurology, National Center for Geriatrics and Gerontology, in collaboration with Associate Professor Yurie Iribe, School of Information Science and Technology, Aichi Prefectural University, and Professor Norihide Kitaoka, Department of Computer Science and Engineering, Toyohashi University of Technology, conducted research on linguistic alteration in patients with Parkinson's disease. Using natural language processing technology, they investigated the characteristics of PD patients' conversations and the possibility of diagnosing Parkinson's disease from free conversational texts. This work was published online in Parkinsonism & Related Disorders.

The results of this study suggest that even in the absence of cognitive decline, the conversations of PD patients are altered from those of healthy subjects.
Katsunori Yokoi et al.

Patients with PD experience a variety of speech-related problems, including impaired speech production and language impairment. To elucidate the pathophysiological mechanisms of language changes in PD, Professor Katsuno and his research team used natural language processing technology to compare the speech of cognitively normal PD patients with that of healthy controls. They recorded the conversations of 53 cognitively normal PD patients and 53 healthy controls and applied natural language processing to the spontaneous conversational text. The researchers then used machine learning algorithms to identify the characteristics of each group's conversation. For this analysis, 37 features focused on parts of speech and syntactic complexity served as evaluation items. Using a support vector machine (SVM) with 10-fold cross-validation, they narrowed down which of these features were effective in identifying PD patients' conversations and tested the identification rate for each group.
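The cross-validation step described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the study used a support vector machine, whereas the sketch swaps in a much simpler nearest-centroid classifier so that it runs without external libraries, and the single feature value per speaker is made-up toy data, not study data. The function names (`stratified_folds`, `train_centroids`, `predict`, `cross_validate`) are hypothetical.

```python
def stratified_folds(labels, k=10):
    """Split indices into k folds, dealing each class out round-robin."""
    folds = [[] for _ in range(k)]
    seen_per_class = {}
    for index, label in enumerate(labels):
        position = seen_per_class.get(label, 0)
        folds[position % k].append(index)
        seen_per_class[label] = position + 1
    return folds


def train_centroids(rows, labels):
    """Stand-in model (NOT an SVM): the per-class mean of each feature."""
    grouped = {}
    for row, label in zip(rows, labels):
        grouped.setdefault(label, []).append(row)
    return {
        label: [sum(column) / len(members) for column in zip(*members)]
        for label, members in grouped.items()
    }


def predict(centroids, row):
    """Return the label whose centroid is closest (squared Euclidean distance)."""
    def distance(label):
        return sum((a - b) ** 2 for a, b in zip(row, centroids[label]))
    return min(centroids, key=distance)


def cross_validate(rows, labels, k=10):
    """Mean held-out accuracy: train on k-1 folds, test on the remaining one."""
    accuracies = []
    for held_out in stratified_folds(labels, k):
        held = set(held_out)
        train = [i for i in range(len(rows)) if i not in held]
        model = train_centroids([rows[i] for i in train],
                                [labels[i] for i in train])
        hits = sum(predict(model, rows[i]) == labels[i] for i in held_out)
        accuracies.append(hits / len(held_out))
    return sum(accuracies) / len(accuracies)


# Made-up, perfectly separable toy data: one feature value per speaker.
# Group sizes mirror the article (53 PD, 53 healthy controls); values do not.
rows = ([[0.20 + 0.001 * i] for i in range(53)]
        + [[0.10 + 0.001 * i] for i in range(53)])
labels = ["PD"] * 53 + ["HC"] * 53

folds = stratified_folds(labels, k=10)
mean_accuracy = cross_validate(rows, labels, k=10)
print([len(fold) for fold in folds], mean_accuracy)
```

Because the toy data are cleanly separated, the accuracy printed here says nothing about real speech data. The point is the procedure: every speaker is held out exactly once, and the model is always evaluated on speakers it was not trained on. In practice the indices would also be shuffled before being dealt into folds.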

The analysis revealed, first, that PD patients used fewer parts of speech per sentence than the healthy control group. Second, compared with the healthy control group, PD patients had a higher percentage of verbs across all conversational sentences, a higher variance (a measure of data variability) for case particles, a higher percentage of verbs per sentence, and a lower percentage of common nouns, proper nouns, and fillers per sentence. When the researchers attempted to classify speakers as PD patients or healthy controls based on these conversational changes, the identification rate was more than 80%.
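To make these measures concrete, the following sketch computes a few of them from sentences that have already been split into tagged tokens. It is an illustration only: the tag names (`VERB`, `CASE_PARTICLE`, `FILLER`, and so on), the function name `pos_features`, and the three romanized example sentences are hypothetical, and the article does not say whether the study used population or sample variance (population variance is assumed here).

```python
from statistics import pvariance


def pos_features(tagged_sentences):
    """Compute simple part-of-speech features from (token, tag) sentences."""
    total_tokens = sum(len(sentence) for sentence in tagged_sentences)
    n_sentences = len(tagged_sentences)

    def counts_per_sentence(tag):
        # One count per sentence: how often `tag` occurs in that sentence.
        return [sum(1 for _, pos in sentence if pos == tag)
                for sentence in tagged_sentences]

    verbs = counts_per_sentence("VERB")
    particles = counts_per_sentence("CASE_PARTICLE")
    fillers = counts_per_sentence("FILLER")
    return {
        # Average sentence length in tagged tokens.
        "tokens_per_sentence": total_tokens / n_sentences,
        # Share of verbs among all tokens in the conversation.
        "verb_rate": sum(verbs) / total_tokens,
        # How unevenly case particles are spread across sentences.
        "case_particle_variance": pvariance(particles),
        "fillers_per_sentence": sum(fillers) / n_sentences,
    }


# Hypothetical toy conversation (romanized), not taken from the study.
conversation = [
    [("watashi", "PRONOUN"), ("ga", "CASE_PARTICLE"), ("hon", "NOUN"),
     ("o", "CASE_PARTICLE"), ("yomu", "VERB")],
    [("eeto", "FILLER"), ("Nagoya", "PROPER_NOUN"),
     ("ni", "CASE_PARTICLE"), ("iku", "VERB")],
    [("taberu", "VERB")],
]

features = pos_features(conversation)
print(features)
```

A real pipeline would obtain the tags from a morphological analyzer and compute all 37 features. The sketch only shows how per-sentence counts, rates, and variances are derived once the tags exist.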

The results of this study suggest that even in the absence of cognitive decline, the conversations of PD patients are altered from those of healthy subjects. The results also suggest that natural language processing can be used to understand the characteristics of PD patients' conversations and to identify PD patients.

Parkinson's disease is the second most common neurodegenerative disease after Alzheimer's disease. It is a relatively common disease, and its prevalence increases with age. It is characterized by motor symptoms, such as slow movements, muscle rigidity, and resting tremor, and non-motor symptoms, such as cognitive impairment, mental impairment, sleep disturbance, autonomic dysfunction, and sensory disturbance.
Communicative changes are also common in PD. These are attributed to a variety of factors, including speech, prosodic changes, pronunciation and articulation changes, language changes, discourse management and pragmatics, and psychosocial influences. Studies have shown that more than 90% of PD patients experience some form of language impairment.

The purpose of this study was to analyze conversational sentences using natural language processing to clarify the pathophysiological mechanisms of language changes in PD patients without cognitive dysfunction.

A total of 73 PD patients and 54 healthy subjects were recruited between April 2012 and March 2020. Of these, 17 PD patients and 1 healthy subject were excluded due to insufficient data, as were 3 patients who were ultimately diagnosed with Lewy body dementia. As a result, 53 PD patients (24 males and 39 females) and 53 healthy subjects (24 males and 39 females) were included in the analysis. No significant differences were found between the groups with respect to age, sex, years of education, and MoCA-J scores.

There were no significant differences between PD patients and healthy subjects on the verbal fluency task (saying as many words beginning with "ka" as possible in one minute) or the semantic fluency task (saying as many animal names as possible in one minute). However, the number of parts of speech was significantly lower in the PD group than in the healthy control group. There were no significant group differences in the number of sentences.

Next, the researchers conducted four trials using the wrapper method to select features that discriminate between the PD and healthy subject groups. The 10-fold cross-validation results showed that the third trial had the highest F-measure. Sensitivity, specificity, positive predictive value, and negative predictive value all exceeded 0.83 for the third trial.
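The four performance measures named here, plus the F value, all derive from the same four counts in a confusion matrix. The sketch below shows the standard formulas, assuming that the F value refers to the F-measure (the harmonic mean of positive predictive value and sensitivity). The counts are invented for illustration and are not the study's results, and the function name `diagnostic_metrics` is hypothetical.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Standard diagnostic measures from confusion-matrix counts.

    tp: PD patients classified as PD        fn: PD patients classified as healthy
    tn: healthy classified as healthy       fp: healthy classified as PD
    """
    sensitivity = tp / (tp + fn)   # share of PD patients detected
    specificity = tn / (tn + fp)   # share of healthy controls cleared
    ppv = tp / (tp + fp)           # positive predictive value (precision)
    npv = tn / (tn + fn)           # negative predictive value
    # F-measure: harmonic mean of precision (PPV) and recall (sensitivity).
    f_measure = 2 * ppv * sensitivity / (ppv + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "npv": npv,
        "f_measure": f_measure,
    }


# Invented counts for illustration only (NOT the study's confusion matrix).
metrics = diagnostic_metrics(tp=40, fp=5, tn=45, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

The sketch does not guard against empty denominators, which would need handling if a classifier never predicted one of the two classes.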

In the analysis, six features were selected as important cues to distinguish PD patients from healthy controls: verb rate, case particle variance, common nouns, proper nouns, verbs, and fillers per sentence. Analysis of the selected features revealed significant differences between the PD group and the healthy group on all six items.
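A wrapper method chooses features by repeatedly training and scoring a classifier on candidate feature subsets. The article does not state which search strategy the researchers used, so the sketch below assumes greedy forward selection, one common wrapper strategy. The scoring function `toy_score`, its weights, and the feature names are hypothetical stand-ins for "cross-validated F-measure of a classifier trained on this subset".

```python
def forward_select(candidates, score):
    """Greedy forward selection: keep adding the feature that most improves
    `score` and stop as soon as no remaining feature helps."""
    selected = []
    best = score(selected)
    remaining = list(candidates)
    while remaining:
        trial_scores = {f: score(selected + [f]) for f in remaining}
        winner = max(trial_scores, key=trial_scores.get)
        if trial_scores[winner] <= best:
            break  # no remaining feature improves the score
        selected.append(winner)
        remaining.remove(winner)
        best = trial_scores[winner]
    return selected, best


# Hypothetical scoring function with arbitrary weights. In a real wrapper
# this would train a classifier on `subset` and return its cross-validated
# performance; the small per-feature penalty mimics overfitting costs.
TOY_GAIN = {"feature_a": 0.30, "feature_b": 0.20, "feature_c": 0.01}


def toy_score(subset):
    return 0.40 + sum(TOY_GAIN[f] for f in subset) - 0.02 * len(subset)


selected, best = forward_select(TOY_GAIN, toy_score)
print(selected, round(best, 2))
```

In this toy run the third feature is rejected because its small gain does not outweigh the penalty. That is the behavior a wrapper relies on to reduce a large candidate set to a handful of useful features.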

Natural language processing of conversations of Parkinson's disease patients

Image source: Nagoya University

The results of this study indicate that the content of conversations differs between PD patients without cognitive impairment and healthy controls, the researchers write. Specifically, compared with healthy subjects, PD patients spoke shorter sentences containing fewer parts of speech, with a higher proportion of verbs, greater variance in case particles, and fewer nouns and fillers (see Figure above). Using these conversational differences, the team confirmed that a support vector machine can discriminate between PD patients and healthy controls with more than 80% accuracy. This result suggests that language analysis based on natural language processing could be used to diagnose PD.

In the future, the researchers plan to use natural language processing to analyze the conversations of PD patients with cognitive decline and patients with neurodegenerative diseases other than PD, especially Alzheimer's disease.

Source: Nagoya University

15.05.2023