Latest in AI

Showing:vlmClear ×

🔥 Trending today

anthropic6 export-controls4 model-access3 spacex3 amazon3 national-security2 open-source2 governance2 ai-regulation2 government-policy2

Topic

Release New Tool Tutorial Business Paper Benchmark Opinion Regulation

For

General Developers Designers Product Founders Marketing Researchers Students

視覺語言模型（VLM）的偏好最佳化指南：使用 TRL 進行 DPO 微調★ 75
Hugging Face Blog704 days agoTutorial
As vision-language models (VLMs) are increasingly applied to multimodal tasks, how to make these models produce outputs that better align with human…
微調 Microsoft Florence-2：微軟頂尖視覺語言模型實戰指南★ 80
Hugging Face Blog720 days agoTutorial
Microsoft open-sourced Florence-2 in June 2024 — a vision-language model (VLM) based on a sequence-to-sequence architecture. Despite its compact size (the Base…
阿布達比 TII 發表 Falcon 2 11B：搭載 5 兆 Token 訓練的預訓練語言與視覺語言模型★ 75
Hugging Face Blog751 days agoRelease
The Technology Innovation Institute (TII) of Abu Dhabi has officially released a new open-source model family on Hugging Face — Falcon 2 11B. This model, with…
Google 推出 PaliGemma：結合 SigLIP 與 Gemma 的開源視覺語言模型★ 80
Hugging Face Blog761 days agoRelease
Google has officially launched PaliGemma, a powerful yet lightweight open-source Vision-Language Model (VLM). The release of PaliGemma represents a significant…
Hugging Face 推出 Idefics2：強大的 8B 開源視覺語言模型★ 80
Hugging Face Blog790 days agoRelease
Hugging Face has announced the launch of Idefics2, the next generation of its open-source Vision Language Model (VLM). With 8 billion (8B) parameters, this…
視覺語言模型（VLM）原理解析：從架構、訓練到應用指南★ 80
Hugging Face Blog794 days agoTutorial
This technical blog post published by Hugging Face provides an accessible yet thorough breakdown of the core principles and applications of Vision Language…
Hugging Face 推出 WebSight 數據集：解鎖網頁截圖直接轉換為 HTML 程式碼的能力★ 75
Hugging Face Blog821 days agoRelease
The Hugging Face official blog has published a post introducing WebSight, a brand-new open-source dataset designed to address the bottleneck that multimodal…
Hugging Face 推出 IDEFICS：開源重現 SOTA 多模態視覺語言模型 Flamingo★ 78
Hugging Face Blog1,027 days agoRelease
Hugging Face has officially launched IDEFICS (Image-supervised Decoder-Encoder-Few-shot-In-Context-Shorthand), an open-source multimodal vision-language model…
在 Habana Gaudi2 上加速視覺語言模型：BridgeTower 實作指南
Hugging Face Blog1,081 days agoTutorial
This technical blog post from Hugging Face details how to accelerate the vision-language model (VLM) "BridgeTower" on Intel's Habana Gaudi2 deep learning…
深入探討視覺語言模型 (Vision-Language Models) 的原理與架構★ 80
Hugging Face Blog1,227 days agoTutorial
This is a classic technical guide written by the Hugging Face team, designed to help developers and researchers gain a deep understanding of how…

← PreviousPage 2