---
license: apache-2.0
language:
- es
- sk
tags:
- audio
- parkinsons
- speech
- health
- classification
- contrastive-learning
- attention
- adain
- wavelets
- self-supervised
datasets:
- ewa-db
- pc-gita
model-index:
- name: BDHPD
  results:
  - task:
      type: audio-classification
      name: Parkinson's Disease Detection
    dataset:
      name: EWA-DB (Slovak)
      type: ewa-db
    metrics:
    - type: f1
      value: 69.03
    - type: accuracy
      value: 84.72
    - type: sensitivity
      value: 56.52
    - type: specificity
      value: 88.56
  - task:
      type: audio-classification
      name: Parkinson's Disease Detection
    dataset:
      name: PC-GITA (Spanish)
      type: pc-gita
    metrics:
    - type: f1
      value: 90.83
    - type: accuracy
      value: 90.83
    - type: sensitivity
      value: 93.33
    - type: specificity
      value: 88.33
---

# BDHPD: Bilingual Dual-Head Architecture for Parkinson's Disease Detection from Speech

This model implements **BDHPD**, a deep neural network designed to detect Parkinson's Disease (PD) from speech signals, with bilingual support for Slovak and Spanish datasets.

## Model Description

BDHPD combines several modern audio processing techniques:

- **Self-supervised learning (SSL)** with models like `microsoft/wavlm-base`
- **Wavelet-based spectrogram features**
- **Adaptive Instance Normalization (AdaIN)** for domain adaptation
- **Convolutional Bottleneck Layers** for feature recalibration
- **Dual-head classification architecture** to handle different speech types (e.g., diadochokinetic and continuous)
- **Contrastive learning** for embedding space refinement
- **Attention pooling** for better sequence summarization

The architecture supports bilingual inputs and has been evaluated on **EWA-DB** (Slovak) and **PC-GITA** (Spanish).
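
To illustrate the attention pooling idea listed above, here is a minimal, dependency-free sketch in plain Python. It is not the BDHPD implementation (see the repository for that): the function name `attention_pool` and the scoring vector `score_weights` are hypothetical, and in a trained model the scoring parameters would be learned and applied to tensors rather than lists. The sketch only shows the general mechanism: score each frame, normalize the scores with a softmax over time, and return the weighted sum of the frames.

```python
import math


def attention_pool(frames, score_weights):
    """Collapse a (time, dim) sequence into a single dim-sized vector.

    frames: list of T frame embeddings, each a list of D floats.
    score_weights: hypothetical scoring vector of D floats
                   (an assumption here; learned during training in practice).
    """
    # One scalar relevance score per frame: dot product with the scoring vector.
    scores = [sum(w * x for w, x in zip(score_weights, frame)) for frame in frames]

    # Numerically stable softmax over the time axis.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]

    # Weighted sum of the frames, one output value per feature dimension.
    dim = len(frames[0])
    pooled = [
        sum(w * frame[d] for w, frame in zip(weights, frames)) for d in range(dim)
    ]
    return pooled, weights


# Toy input: three frames with two features each.
frames = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

# A zero scoring vector gives every frame the same weight (plain mean pooling).
mean_pooled, uniform_weights = attention_pool(frames, score_weights=[0.0, 0.0])

# A scoring vector that favors the first feature down-weights the second frame.
focused_pooled, focused_weights = attention_pool(frames, score_weights=[10.0, 0.0])
```

With a zero scoring vector the result reduces to mean pooling, which is a useful sanity check; a non-trivial scoring vector lets the summary emphasize the frames it scores highest.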

## Intended Use

- **Research** in pathological speech detection
- **Benchmarking** bilingual speech-based PD detection models
- **Development** of real-world diagnostic support tools in healthcare

## Training

Training was performed using:

- AdamW optimizer
- Linear learning rate scheduling with warmup
- Binary cross-entropy loss for classification
- Contrastive loss via `pytorch-metric-learning`
- 20 epochs with early stopping
- Balanced batch sampling for both datasets

## How to Use

Code and usage details are available in the GitHub repository: [BDHPD GitHub](https://github.com/MorenoLaQuatra/BDHPD)

## Datasets

- [**EWA-DB**](https://zenodo.org/records/10952480): Slovak pathological and healthy speech
- [**PC-GITA**](https://aclanthology.org/L14-1549/): Spanish pathological speech

## Limitations

- The model is trained only on Slovak and Spanish speakers; generalization to other languages is untested.
- The model is sensitive to audio quality: preprocess recordings with voice activity detection (VAD) and dereverberation.
- The model should not be used as a standalone diagnostic tool.

## Citation

If you use this model or find this research useful, please cite the following paper:

```bibtex
@inproceedings{laquatra2025bilingual,
  title={Bilingual Dual-Head Deep Model for Parkinson's Disease Detection from Speech},
  author={La Quatra, Moreno and Orozco-Arroyave, Juan Rafael and Siniscalchi, Marco Sabato},
  booktitle={ICASSP 2025 - IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
  year={2025}
}
```