🚀 Exciting News: Two New Open-Source Bambara ASR Models!

We're thrilled to announce the release of two lightweight Bambara ASR models from our early fine-tuning experiments! These models achieve near-SOTA performance for open-source Bambara ASR (as of Feb 2025) while remaining efficient enough for real-world deployment.

🧐 Why Does This Matter?

Bambara is an under-resourced language. While ASR research has made rapid progress in major languages, open-source models for Bambara remain rare. Our goal is to share our fine-tuning experience to help push research forward and support the development of ASR for African languages.


🎙️ Model 1: Parakeet-114M TDT-CTC

📌 Hugging Face: RobotsMali/soloni-114m-tdt-ctc

🧠 Model Size: ~114M parameters

📊 WER (TDT Decoder): 66%

📊 WER (CTC Decoder): 40.6%

💡 Key Takeaway: Despite its small size, the CTC branch outperforms previous open-source models.
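For readers unfamiliar with the metric: WER (word error rate) is the word-level edit distance between a reference transcript and the model's hypothesis, divided by the number of reference words. A minimal sketch in plain Python; note the Bambara sentences below are made-up illustrations, not drawn from our evaluation set:

```python
# WER = (substitutions + insertions + deletions) / reference word count,
# computed via a classic dynamic-programming edit distance over words.

def wer(reference: str, hypothesis: str) -> float:
    ref, hyp = reference.split(), hypothesis.split()
    # d[i][j] = minimum edits to turn the first i reference words
    # into the first j hypothesis words.
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,          # deletion
                          d[i][j - 1] + 1,          # insertion
                          d[i - 1][j - 1] + cost)   # substitution
    return d[len(ref)][len(hyp)] / len(ref)

print(wer("i ni ce", "i ni ce"))  # 0.0 (perfect match)
print(wer("i ni ce", "i ni so"))  # one substitution out of three words
```

A 40.6% WER thus means roughly four word-level errors per ten reference words, averaged over the test set.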


🎧 Model 2: QuartzNet-15×5

📌 Hugging Face: RobotsMali/stt-bm-quartznet15x5

🧠 Model Size: ~19M parameters

📊 WER: 46.5%

💡 Key Takeaway: This lightweight model is optimized for real-time ASR in low-resource environments.

🛠️ Open-Sourcing for Research & Community Feedback

These models are the result of early fine-tuning experiments, not finished products. We're making them available for research and community contributions.

🔗 Code & Configs: GitHub

📑 Experimental Report: W&B

🤝 Final Thoughts

We hope this contributes to Bambara ASR research and encourages more collaboration. 🚀 Stay tuned for future updates!

📩 Want to contribute or learn more? Reach out via RobotsMali or join the discussion on GitHub!
