Hibiki-Zero, an open-source, real-time, multilingual speech translation model

Hibiki-Zero builds on the architecture of the original Hibiki but introduces a new training method based on reinforcement learning (RL). While Hibiki relied on complex heuristics to create aligned synthetic data, Hibiki-Zero only requires sentence-level alignment and learns word-level alignments through RL. This simplifies synthetic data creation, reduces latency, and scales seamlessly to multiple languages.

story on X: https://x.com/kyutai_labs/status/2022007408898511113
GitHub: https://github.com/kyutai-labs/hibiki-zero
arXiv: https://arxiv.org/abs/2602.11072
