Hugging Face Blog | Feb 21, 2025, 12:00 AM

Google Releases SigLIP 2: A More Powerful Multilingual Vision-Language Encoder

Original: SigLIP 2: A better multilingual vision language encoder

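Google and Hugging Face have jointly announced the SigLIP 2 vision-language encoder. As an upgrade to the original SigLIP, SigLIP 2 introduces dynamic resolution, self-supervised learning (SSL) auxiliary tasks, and stronger multilingual support. It performs well on zero-shot classification, image-text retrieval, and localization tasks, and it is available in multiple model sizes, which makes it well suited as the vision backbone for the next generation of multimodal large models (VLMs).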

Google has officially launched SigLIP 2, a major upgrade to its widely popular SigLIP (Sigmoid Loss for Language-Image Pre-training) vision-language encoder. SigLIP 2 is designed to provide stronger visual and linguistic representation capabilities for multimodal tasks.
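The "Sigmoid Loss" in the SigLIP name refers to scoring every image-text pair in a batch as an independent binary decision rather than with a batch-wide softmax. The sketch below illustrates that idea in plain Python. It follows the formulation in the original SigLIP paper as best understood here, not anything stated in this summary, and it does not cover the additional SigLIP 2 objectives. The function names and the `temperature` and `bias` defaults are illustrative assumptions.

```python
import math


def log_sigmoid(x: float) -> float:
    """Numerically stable log(sigmoid(x))."""
    # log(sigmoid(x)) = min(x, 0) - log(1 + exp(-|x|))
    return min(x, 0.0) - math.log1p(math.exp(-abs(x)))


def l2_normalize(v):
    """Scale a vector to unit length so dot products are cosine similarities."""
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]


def pairwise_sigmoid_loss(image_embs, text_embs, temperature=10.0, bias=-10.0):
    """Sigmoid loss over a batch of N matched image-text pairs.

    Pair (i, i) is treated as a positive (label +1); every pair (i, j)
    with i != j is treated as a negative (label -1). The temperature and
    bias values are illustrative assumptions; in training they would be
    learnable parameters.
    """
    imgs = [l2_normalize(v) for v in image_embs]
    txts = [l2_normalize(v) for v in text_embs]
    n = len(imgs)
    total = 0.0
    for i in range(n):
        for j in range(n):
            similarity = sum(a * b for a, b in zip(imgs[i], txts[j]))
            logit = temperature * similarity + bias
            label = 1.0 if i == j else -1.0
            total += -log_sigmoid(label * logit)
    # Sum over all N*N pairs, normalized by the batch size N.
    return total / n


if __name__ == "__main__":
    # Toy 2-D embeddings: each image matches the text at the same index.
    images = [[1.0, 0.0], [0.0, 1.0]]
    matched_texts = [[1.0, 0.0], [0.0, 1.0]]
    swapped_texts = [[0.0, 1.0], [1.0, 0.0]]
    print("matched:", pairwise_sigmoid_loss(images, matched_texts))
    print("swapped:", pairwise_sigmoid_loss(images, swapped_texts))
```

Because each pair is scored on its own, the loss for correctly matched embeddings comes out lower than for swapped ones, and no normalization across the whole batch is needed.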

Summaries are AI-generated; the original article is authoritative.