AI-ML
Transformers v5.3.0
SUMMARY
New Model additions: EuroBERT. EuroBERT is a multilingual encoder model based on a refreshed transformer architecture, akin to Llama but with bidirectional attention.
Detailed Description
New Model additions

EuroBERT

EuroBERT is a multilingual encoder model based on a refreshed transformer architecture, akin to Llama but with bidirectional attention. It supports a mixture of European and widely spoken languages, with sequences of up to 8192 tokens.

Links: Documentation | Paper | Blog Post

* Add eurobert by @ArthurZucker in #39455

VibeVoice ASR

VibeVoice ASR is an automatic speech recognition model from Microsoft that combines acoustic and semantic audio tokenizers with a causal language model for robust speech-to-text transcription. The model uses VibeVoice's acoustic and semantic tokenizers that process audio at 24 kHz, paired with a Qwen2-based language decoder for generating transcriptions. It can process up to 60 minutes of continuous audio input, supports customized hotwords, performs joint ASR/diarization/timestamping, and handles over 50 languages with code-switching support.

Links: [Documentation](
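To put the VibeVoice ASR figures quoted above in concrete terms, here is a small arithmetic sketch of how many raw samples a 60-minute recording holds at 24 kHz. The helper names (`samples_for`, `fits_in_one_pass`) are hypothetical and not part of Transformers. The sketch also assumes mono audio and treats the 60-minute figure as a hard cap, which the release notes do not spell out.

```python
# Figures taken from the release notes; everything else is illustrative.
SAMPLE_RATE_HZ = 24_000   # tokenizers process audio at 24 kHz
MAX_MINUTES = 60          # up to 60 minutes of continuous audio


def samples_for(duration_seconds: float, sample_rate_hz: int = SAMPLE_RATE_HZ) -> int:
    """Number of mono samples in a clip of the given duration."""
    return int(round(duration_seconds * sample_rate_hz))


def fits_in_one_pass(num_samples: int,
                     sample_rate_hz: int = SAMPLE_RATE_HZ,
                     max_minutes: int = MAX_MINUTES) -> bool:
    """True if the clip is no longer than the assumed 60-minute limit."""
    return num_samples <= max_minutes * 60 * sample_rate_hz


max_samples = samples_for(MAX_MINUTES * 60)
print(max_samples)                      # 86400000
print(fits_in_one_pass(max_samples))    # True
```

A check like this could run before loading a long recording, to decide whether it needs to be split first.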
Transformers v5.3.0 adds EuroBERT and VibeVoice ASR, but includes breaking changes.
- Adds EuroBERT, a multilingual model with bidirectional attention and support for sequences of up to 8192 tokens.
- Introduces VibeVoice ASR, a speech recognition model that supports over 50 languages, with customized hotwords and joint ASR/diarization/timestamping.
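The EuroBERT description contrasts bidirectional attention with the causal attention used by Llama-style decoders. The sketch below shows that difference as plain attention masks, where `1` means "position i may attend to position j". It is a conceptual illustration only, not EuroBERT's actual implementation, and it ignores padding.

```python
def causal_mask(n: int) -> list[list[int]]:
    """Decoder-style mask: token i sees only tokens 0..i."""
    return [[1 if j <= i else 0 for j in range(n)] for i in range(n)]


def bidirectional_mask(n: int) -> list[list[int]]:
    """Encoder-style mask: every token sees every other token."""
    return [[1] * n for _ in range(n)]


print("causal:")
for row in causal_mask(4):
    print(row)

print("bidirectional:")
for row in bidirectional_mask(4):
    print(row)
```

The causal mask is lower-triangular, while the bidirectional mask is all ones, which is what lets an encoder use context on both sides of each token.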
Who should care
Only relevant if you are using an earlier version of Transformers and need to maintain compatibility.
AI-generated · may contain errors
Related Releases
AI-ML
Transformers v5.2.0
## New Model additions
### VoxtralRealtime
VoxtralRealtime is a streaming speech-to-text model from [Mistral AI](https://mistral.ai), designed for real-time a
AI-ML
Transformers v5.1.0
## New Model additions
### EXAONE-MoE
K-EXAONE is a large-scale multilingual language model developed by LG AI Research. Built using a Mixture-of-Experts arch