Patient Daily | Mar 7, 2026

AI system combines voice, gait and handwriting analysis for Parkinson's detection

A new artificial intelligence (AI) system has demonstrated the ability to detect Parkinson’s disease by analyzing three different types of digital biomarkers: voice patterns, walking gait, and handwriting. The research, published in Frontiers in Digital Health, details how combining these modalities into a single explainable AI model could improve early screening for Parkinson’s.

Parkinson’s disease is a progressive neurological disorder that affects movement and can also cause speech and cognitive problems. Diagnosing it typically relies on clinical evaluation, which can sometimes lead to misdiagnosis or missed cases, especially at early stages.

The study notes that while AI-based analysis of speech, gait, or handwriting individually has achieved high accuracy rates, up to 99% for speech analysis, 97% for gait analysis, and nearly 98% for handwriting, each method faces challenges when used alone. For example, speech recognition can be affected by accent or background noise; gait detection depends on sensor quality; and handwriting analysis often uses data from controlled settings rather than real-world environments.

To address these limitations, the researchers developed a multimodal deep learning framework that processes data from all three sources. The system extracts features from each modality using neural networks: EfficientNet-B0 analyzes log-Mel spectrograms from speech; temporal convolutional networks process vertical ground reaction force data from wearable sensors to assess gait; and ResNet-50 evaluates spiral drawings to identify handwriting abnormalities linked to tremor.
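The late-fusion idea behind this pipeline can be sketched in a few lines of PyTorch. The tiny encoders below are illustrative stand-ins, not the paper's actual backbones (EfficientNet-B0, a temporal convolutional network, and ResNet-50), and all input shapes and feature sizes here are assumptions chosen for the sketch:

```python
import torch
import torch.nn as nn

class TinyEncoder2d(nn.Module):
    """Stand-in for an image backbone (EfficientNet-B0 or ResNet-50 in the paper)."""
    def __init__(self, out_dim=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(8, out_dim),
        )
    def forward(self, x):
        return self.net(x)

class TinyTCN(nn.Module):
    """Stand-in for the temporal convolutional network on gait force signals."""
    def __init__(self, in_ch=2, out_dim=32):
        super().__init__()
        self.net = nn.Sequential(
            # Dilated 1-D convolution, the building block of a TCN
            nn.Conv1d(in_ch, 8, 3, padding=2, dilation=2), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1), nn.Flatten(),
            nn.Linear(8, out_dim),
        )
    def forward(self, x):
        return self.net(x)

speech_enc, gait_enc, writing_enc = TinyEncoder2d(), TinyTCN(), TinyEncoder2d()
spectrogram = torch.randn(4, 1, 64, 100)  # batch of log-Mel spectrograms (assumed shape)
gait_signal = torch.randn(4, 2, 500)      # vertical ground reaction force, two sensors
spiral_img  = torch.randn(4, 1, 96, 96)   # digitized spiral drawings

# Each modality is encoded separately, then concatenated into one fused vector
fused = torch.cat([speech_enc(spectrogram),
                   gait_enc(gait_signal),
                   writing_enc(spiral_img)], dim=1)
print(fused.shape)  # one 96-dimensional fused feature vector per subject
```

The fused vector, one per subject, is what the downstream classifier sees; the modality-specific networks only act as feature extractors.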

The combined features are then classified using an XGBoost model. To ensure transparency in decision-making—a common criticism of many AI systems—the model incorporates explainability tools such as SHapley Additive exPlanations (SHAP), Gradient-weighted Class Activation Mapping (Grad-CAM), and Integrated Gradients. These tools allow clinicians to see which aspects of the input data influenced the system’s predictions.
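A minimal sketch of this fusion-then-classify step follows, using scikit-learn's GradientBoostingClassifier as a stand-in for XGBoost (same gradient-boosted-trees family) and synthetic data in place of the real fused features; the subject count, feature dimension, and labels are all invented for illustration:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

rng = np.random.default_rng(0)
n_subjects, n_features = 200, 96           # hypothetical fused-feature matrix
X = rng.normal(size=(n_subjects, n_features))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)  # synthetic PD / healthy labels

# Gradient-boosted trees over the concatenated multimodal features
clf = GradientBoostingClassifier(random_state=0).fit(X, y)

# feature_importances_ is a coarse global attribution; the paper instead uses
# SHAP for per-prediction attributions, plus Grad-CAM and Integrated Gradients
# to highlight regions inside each modality's raw input.
top_features = np.argsort(clf.feature_importances_)[::-1][:3]
print(top_features, clf.score(X, y))
```

Because attribution is computed over the fused vector, a clinician can see which modality's features drove a given prediction before drilling into the saliency maps for that modality.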

Testing was conducted using publicly available datasets: a spiral handwriting set with over 3,200 samples, a speech dataset with around 73 subjects, and a gait dataset with about 168 subjects. Fivefold cross-validation was used to assess performance reliability.
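Fivefold cross-validation of this kind is straightforward to reproduce with scikit-learn; the sketch below uses a logistic regression on synthetic data purely to show the mechanics (the study's actual model and datasets differ):

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 10))
y = (X[:, 0] > 0).astype(int)  # synthetic binary labels

# Stratified folds keep the patient/control ratio stable in each split
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_val_score(LogisticRegression(), X, y, cv=cv)
print(scores, scores.mean())  # one accuracy per held-out fold
```

Each subject appears in exactly one test fold, so the mean score estimates how the model would perform on unseen subjects rather than on its own training data.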

Results showed that the trimodal fusion model reached an overall accuracy of 92%, outperforming models based on only one type of input—91% for handwriting alone, 90% for gait alone, and 74% for speech alone. The model also achieved balanced sensitivity (90%) and specificity (89%), meaning it effectively distinguished between Parkinson’s patients and healthy individuals without generating excessive false positives or negatives.
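Sensitivity and specificity fall directly out of a confusion matrix. The counts below are hypothetical numbers chosen only to match the reported 90%/89% rates, not figures from the study:

```python
# Hypothetical confusion counts consistent with ~90% sensitivity / ~89% specificity
tp, fn = 90, 10    # Parkinson's cases correctly / incorrectly flagged
tn, fp = 89, 11    # healthy controls correctly / incorrectly flagged

sensitivity = tp / (tp + fn)   # true positive rate: cases caught
specificity = tn / (tn + fp)   # true negative rate: controls cleared
accuracy = (tp + tn) / (tp + fn + tn + fp)
print(sensitivity, specificity, accuracy)  # 0.9 0.89 0.895
```

Balanced values on both measures are what "without generating excessive false positives or negatives" means in practice: the model is not buying sensitivity at the cost of flagging healthy people.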

"Future studies should involve neurologists and longitudinal analyses to establish the clinical validity of this framework, build trust, and ensure regulatory readiness," according to the authors. "Lighter, deployment-oriented versions of the model, along with more adaptive multimodal fusion strategies, may further enhance real-world applicability."

Despite promising results on benchmark datasets, the researchers caution that their system has not yet been validated in prospective clinical trials or tested under conditions where some types of data might be missing. They also note that generalizability remains an issue for certain modalities like speech and gait when applied outside controlled environments.

The study did not include other potential biomarkers such as cerebrospinal fluid tests or neuroimaging but focused solely on voice recordings, walking patterns measured by sensors, and digitized handwriting samples.

The authors suggest future work should include clinical involvement to validate explainability claims and explore adaptive fusion mechanisms capable of handling incomplete or variable-quality data in real-world scenarios.
