Hugging Face Blog · Jan 24, 2025, 12:00 AM · important 80

Hugging Face's lightweight agent framework smolagents now officially supports vision-language models (VLMs)!

Original: We now support VLMs in smolagents!

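smolagents, Hugging Face's lightweight open-source agent library, has received a major update and now officially supports vision-language models (VLMs). Developers can now have agents receive and process image input, which suits visual web navigation, chart analysis, and multimodal tasks. The update greatly broadens what code agents can be used for, letting them "see" and understand visual information from the real world.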

On January 24, 2025, Hugging Face announced that smolagents, its open-source library for building lightweight, high-performance AI agents, now officially supports Vision-Language Models (VLMs). The update gives smolagents visual capabilities, so developers can build agents that understand images, charts, and even webpage screenshots.
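
As a rough illustration of what passing images to an agent might look like, here is a minimal sketch. It assumes the smolagents and Pillow packages are installed, that `agent.run` accepts an `images` list of PIL images as the announcement describes, and that an OpenAI API key is available in the environment. The helper names `pick_image_paths` and `ask_about_images`, the suffix list, and the `gpt-4o` model choice are illustrative assumptions, not part of the original article.

```python
from pathlib import Path
from typing import Any, Iterable, List

# Hypothetical list of extensions treated as images in this sketch.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".webp"}


def pick_image_paths(paths: Iterable[str]) -> List[Path]:
    """Keep only paths whose extension looks like an image (illustrative helper)."""
    return [Path(p) for p in paths if Path(p).suffix.lower() in IMAGE_SUFFIXES]


def ask_about_images(task: str, paths: Iterable[str], model_id: str = "gpt-4o") -> Any:
    """Run a code agent on a task together with the given image files.

    Assumes a vision-capable model and OPENAI_API_KEY set in the environment.
    """
    # Imported lazily so the helper above works without these packages installed.
    from PIL import Image
    from smolagents import CodeAgent, OpenAIServerModel

    images = [Image.open(p) for p in pick_image_paths(paths)]
    model = OpenAIServerModel(model_id=model_id)  # assumed vision-capable model
    agent = CodeAgent(tools=[], model=model)
    # The images are handed to the agent alongside the text task.
    return agent.run(task, images=images)
```

A call such as `ask_about_images("Summarize the trend in this chart.", ["chart.png"])` would then send both the question and the image to the model; the actual answer depends on the model used.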

open-source gpt claude llama smolagents #agents #vlm #multimodal #computer-vision #python

Summaries are AI-generated; the original article is authoritative.