Google 推出 SigLIP 2:更強大的多語言視覺語言編碼器
Original: SigLIP 2: A better multilingual vision language encoder
Google has officially launched SigLIP 2, a major upgrade to its widely popular SigLIP (Sigmoid Loss for Language-Image Pre-training)…
Google 與 Hugging Face 聯合發表 SigLIP 2 視覺語言編碼器。作為經典 SigLIP 的升級版,SigLIP 2 引入了動態解析度、自監督學習(SSL)輔助任務與更強的多語言支援。它在零樣本分類、圖文檢索及定位等任務上表現優異,並提供多種尺寸的模型,非常適合用作新一代多模態大模型(VLM)的視覺骨幹網路(Vision Backbone)。
Google has officially launched SigLIP 2, a major upgrade to its widely popular SigLIP (Sigmoid Loss for Language-Image Pre-training) vision-language encoder. SigLIP 2 is designed to provide stronger visual and linguistic representation capabilities for multimodal tasks.
Free shows the 3-line summary; Pro unlocks the full deep summary (~300 words) so you never have to click through.
See Pro plans →Want the original English / full article?
Read on Hugging Face Blog →Summaries are AI-generated; the original article is authoritative.