🚀 Exciting News: Two New Open-Source Bambara ASR Models!

We’re thrilled to announce the release of two lightweight Bambara ASR models from our early fine-tuning experiments! These models achieve near state-of-the-art performance for open-source Bambara ASR (as of February 2025) while remaining efficient enough for real-world deployment.

🧐 Why Does This Matter?

Bambara is a severely under-resourced language. While ASR research has made rapid progress on major languages, open-source models for Bambara remain rare. By sharing our fine-tuning experiments, we hope to push research forward and support the development of ASR for African languages.


🎙️ Model 1: Parakeet-114M TDT-CTC

📌 Hugging Face: RobotsMali/soloni-114m-tdt-ctc

🧠 Model Size: ~114M parameters

📊 WER (TDT Decoder): 66%

📊 WER (CTC Decoder): 40.6%

💡 Key Takeaway: Despite the model’s small size, its CTC decoding branch outperforms previous open-source Bambara models.


🎧 Model 2: QuartzNet-15×5

📌 Hugging Face: RobotsMali/stt-bm-quartznet15x5

🧠 Model Size: ~19M parameters

📊 WER: 46.5%

💡 Key Takeaway: This lightweight model is optimized for real-time ASR in low-resource environments.
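Both models above are reported in terms of word error rate (WER). For readers comparing these numbers against their own tests, here is a minimal, generic WER implementation (word-level Levenshtein distance divided by reference length); this is an illustrative sketch, not the exact evaluation script used to produce the figures above:

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edit distance / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    # Dynamic-programming edit distance over words
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i  # deleting all reference words
    for j in range(len(hyp) + 1):
        d[0][j] = j  # inserting all hypothesis words
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            sub_cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,        # deletion
                d[i][j - 1] + 1,        # insertion
                d[i - 1][j - 1] + sub_cost,  # substitution / match
            )
    return d[len(ref)][len(hyp)] / len(ref)

# One substitution + one deletion over a 4-word reference -> WER 0.5
print(wer("a b c d", "a x c"))  # 0.5
```

A WER of 46.5% therefore means that, on average, roughly 46.5 word edits (substitutions, insertions, or deletions) are needed per 100 reference words.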

🛠️ Open-Sourcing for Research & Community Feedback

These models are the result of early fine-tuning experiments, not a finished product. We’re making them available for research use and community contributions.

🔗 Code & Configs: GitHub

📑 Experimental Report: W&B

🤝 Final Thoughts

We hope this contributes to Bambara ASR research and encourages more collaboration. 🚀 Stay tuned for future updates!

📩 Want to contribute or learn more? Reach out via RobotsMali or join the discussion on GitHub!
