Latest in AI

Showing: asr, Researchers

Can Voice Agents Handle Bilingual Customers? Benchmarking Frontier ASR on Code-Switched Speech
Hugging Face Blog · 5 days ago · Benchmark
Code-switching—where bilingual speakers blend two languages in a single utterance—is common in markets like Taiwan, Singapore, and India, yet most ASR benchmarks focus on monolingual audio. ServiceNow AI evaluates frontier speech recognition models specifically on this mixed-language scenario. The findings help enterprise teams make informed ASR model choices when deploying voice agents for multilingual customer-facing applications.
Voxtral ★ 78
Mistral AI News · 6 days ago · Release
Mistral AI introduces Voxtral, a speech understanding model family with 24B and 3B variants under Apache 2.0. The models support long-context transcription, audio Q&A, summarization, multilingual detection, and function calling from voice. Mistral says Voxtral is competitive across transcription and audio understanding benchmarks, with API access starting at $0.001 per minute and local downloads available on Hugging Face.
How to Fine-Tune Nemotron 3.5 ASR for Your Language, Domain, or Accent
Hugging Face Blog · 10 days ago · Tutorial
This Hugging Face Blog post appears to be a practical tutorial for fine-tuning NVIDIA Nemotron 3.5 ASR. Based on the title, it focuses on adapting speech recognition to a target language, specialized domain, or accent. The original text was not provided, so implementation details, datasets, commands, metrics, and hardware requirements cannot be confirmed.
Hugging Face Introduces an Anti-Gaming Mechanism for the Open ASR Leaderboard, Using Private Test Data to Counter Benchmaxxers ★ 75
Hugging Face Blog · 39 days ago · Release
Hugging Face has recently made a major update to its popular Open ASR (Automatic Speech Recognition) leaderboard, aimed at combating the increasingly serious…
Hugging Face Launches New Open ASR Leaderboard Tracks: A Focus on Multilingual and Long-Form Speech Recognition ★ 75
Hugging Face Blog · 205 days ago · Release
Hugging Face recently made a major upgrade to its flagship "Open ASR Leaderboard," officially launching two brand-new evaluation tracks: "Multilingual" and…
High-Performance ASR, Speaker Diarization, and Speculative Decoding with Hugging Face Inference Endpoints ★ 75
Hugging Face Blog · 774 days ago · Tutorial
This technical blog post from Hugging Face introduces how to build a powerful and efficient speech processing system using Hugging Face Inference Endpoints — a…
Fine-Tuning W2V2-BERT for Low-Resource Speech Recognition (ASR) with 🤗 Transformers ★ 75
Hugging Face Blog · 877 days ago · Tutorial
This technical blog post from Hugging Face provides a detailed walkthrough of how to use the `transformers` library to fine-tune Meta's open-source W2V2-BERT…
Fine-Tuning MMS Adapter Models: Building Dedicated Speech Recognition (ASR) for Low-Resource Languages ★ 70
Hugging Face Blog · 1,091 days ago · Tutorial
Meta's MMS (Massively Multilingual Speech) project, released in 2023, extends speech technology to over 1,000 languages, covering automatic speech recognition…
Microsoft SpeechT5 Arrives on Hugging Face: A Versatile Unified Model for Speech Synthesis, Recognition, and Conversion ★ 75
Hugging Face Blog · 1,222 days ago · Release
Microsoft's SpeechT5 model has been officially integrated into Hugging Face's Transformers library. This represents a significant advancement in the field of…
Fine-Tuning Whisper for Multilingual Speech Recognition (ASR) with 🤗 Transformers ★ 80
Hugging Face Blog · 1,319 days ago · Tutorial
OpenAI's Whisper is a powerful automatic speech recognition (ASR) model. While its zero-shot capabilities are impressive, there remains significant room for…
Automatic Speech Recognition (ASR) on Very Long Audio Files with Wav2Vec2 in 🤗 Transformers
Hugging Face Blog · 1,594 days ago · Tutorial
In the field of automatic speech recognition (ASR), Wav2Vec2 is a revolutionary model, but it faces a significant challenge when processing long audio files…
Boosting Wav2Vec2 Speech Recognition Performance with n-grams in 🤗 Transformers
Hugging Face Blog · 1,614 days ago · Tutorial
This technical blog post from Hugging Face introduces how combining n-gram language models (LMs) can significantly improve the performance of Wav2Vec2…
Fine-Tuning XLSR-Wav2Vec2 for Low-Resource Speech Recognition (ASR) with 🤗 Transformers
Hugging Face Blog · 1,672 days ago · Tutorial
Automatic speech recognition (ASR) has achieved remarkable success for resource-rich languages such as English and standard Mandarin, but building…
Fine-Tuning Wav2Vec2 for English Speech Recognition (ASR) with 🤗 Transformers on Hugging Face ★ 70
Hugging Face Blog · 1,920 days ago · Tutorial
This is a landmark technical tutorial published by the Hugging Face team in 2021, detailing how to fine-tune Meta AI's Wav2Vec2 model using the Hugging Face…