Hugging Face Blog | Feb 20, 2025, 12:00 AM | important 80

SmolVLM2: A Lightweight Vision-Language Model That Brings Video Understanding to Every Device

Original: SmolVLM2: Bringing Video Understanding to Every Device

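Hugging Face has officially released the SmolVLM2 model family, designed for personal devices such as phones and laptops. The headline features of this release are video understanding and multi-image processing. The flagship 2.2B version keeps compute requirements very low while matching larger models on several vision and video benchmarks. The models are fully open source under the Apache 2.0 license, which makes them practical to adopt.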

Hugging Face has introduced SmolVLM2, the latest addition to its Smol family of lightweight models. SmolVLM2 is designed to bring advanced vision-language model (VLM) and video understanding capabilities to consumer-grade hardware such as smartphones, laptops, and embedded devices, enabling fully local, privacy-safe computation.

open-source transformers #vlm #video-understanding #on-device #edge-ai #open-source

Summaries are AI-generated; the original article is authoritative.