Latest in AI

Showing: audio · Researchers

NVIDIA Launches Nemotron 3 Nano Omni: A Long-Context Multimodal Intelligence Model Designed for Document, Speech, and Video Agents ★ 75
Hugging Face Blog · 47 days ago · Release
NVIDIA has officially launched a new lightweight multimodal model, "Nemotron 3 Nano Omni." This model is designed to deliver powerful multimodal intelligence…
Google DeepMind Launches Improved Gemini Audio Models for a More Powerful Voice Interaction Experience ★ 85
Google DeepMind Blog · 184 days ago · Release
Google DeepMind has announced a major upgrade to its Gemini audio models, aimed at delivering a more natural, fluid, and low-latency voice interaction…
Google Announces Gemma 3n Preview: A Powerful, Efficient, Mobile-First On-Device Multimodal AI Model ★ 78
Google DeepMind Blog · 390 days ago · Release
Google DeepMind has officially released a preview of its new open model "Gemma 3n." This is a cutting-edge open model purpose-built for mobile devices and…
Hugging Face Launches a Blazing-Fast Whisper Speech-to-Text Deployment Option on Inference Endpoints ★ 75
Hugging Face Blog · 397 days ago · New Tool
Hugging Face recently announced a brand-new, ultra-fast optimized deployment solution for OpenAI's open-source speech recognition model Whisper on its hosted…
Evaluating Audio Reasoning: Hugging Face Introduces the Big Bench Audio Benchmark ★ 75
Hugging Face Blog · 541 days ago · Release
As multimodal large language models (such as GPT-4o, Gemini, and various open-source audio models) continue to proliferate, AI's ability to process audio has…
Fine-Tuning W2V2-BERT for Low-Resource Speech Recognition (ASR) with 🤗 Transformers ★ 75
Hugging Face Blog · 877 days ago · Tutorial
This technical blog post from Hugging Face provides a detailed walkthrough of how to use the `transformers` library to fine-tune Meta's open-source W2V2-BERT…
Fine-Tune MusicGen on Replicate to Easily Generate Music in Any Style
Replicate Blog · 975 days ago · New Tool
AI cloud deployment and runtime platform Replicate has announced official support for fine-tuning Meta's open-source music generation model MusicGen. This new…
A Complete Guide to Hugging Face Audio Datasets: From Loading and Preprocessing to Streaming ★ 70
Hugging Face Blog · 1,277 days ago · Tutorial
With the rapid growth of voice AI (such as Whisper), efficiently handling audio datasets has become critically important. This guide from the official Hugging…
Hugging Face Datasets Releases New Audio and Computer Vision Documentation Guides
Hugging Face Blog · 1,417 days ago · Release
Hugging Face announced new official Audio and Vision documentation guides for its core open-source library `datasets`. As multimodal AI models continue to…
Fine-Tuning Wav2Vec2 for English Speech Recognition (ASR) with 🤗 Transformers on Hugging Face ★ 70
Hugging Face Blog · 1,920 days ago · Tutorial
This is a landmark technical tutorial published by the Hugging Face team in 2021, detailing how to fine-tune Meta AI's Wav2Vec2 model using the Hugging Face…