Hugging Face Blog · Feb 11, 2022, 12:00 AM

Tutorial: Fine-Tuning ViT for Image Classification with 🤗 Transformers

Original: Fine-Tune ViT for Image Classification with 🤗 Transformers


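This official Hugging Face tutorial explains in detail how to fine-tune a Vision Transformer (ViT) model using the `transformers` and `datasets` libraries. It covers the complete workflow: loading the Beans dataset, preprocessing the data with an image processor, configuring the `Trainer` API for training, and finally uploading the fine-tuned model to the Hugging Face Hub. It is an essential introductory guide for computer vision developers.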

This is an official Hugging Face tutorial that guides developers through fine-tuning a Vision Transformer (ViT) model for image classification with the Hugging Face Transformers library. Following the success of the Transformer architecture in natural language processing (NLP), Google's ViT brought that success to computer vision (CV), demonstrating performance that surpasses traditional convolutional neural networks (CNNs) on multiple benchmarks.
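The workflow this summary describes (load Beans, preprocess with an image processor, train with `Trainer`, optionally push to the Hub) might look roughly like the sketch below. It is not the tutorial's own code. The checkpoint name, output directory, hyperparameters, and the helper names `build_label_maps` and `fine_tune_vit_on_beans` are assumptions made for illustration; the original article is the authority on the exact settings.

```python
"""Minimal sketch: fine-tune ViT on the Beans dataset with the Trainer API."""

# Assumption: the checkpoint is not named in this summary; this is a public ViT checkpoint.
CHECKPOINT = "google/vit-base-patch16-224-in21k"
# Assumption: hypothetical output directory for checkpoints.
OUTPUT_DIR = "./vit-beans-sketch"


def build_label_maps(label_names):
    """Return (id2label, label2id) mappings for the model config."""
    id2label = {i: name for i, name in enumerate(label_names)}
    label2id = {name: i for i, name in enumerate(label_names)}
    return id2label, label2id


def fine_tune_vit_on_beans(push_to_hub=False):
    # Heavy imports live here so the helper above works without them installed.
    import torch
    from datasets import load_dataset
    from transformers import (
        Trainer,
        TrainingArguments,
        ViTForImageClassification,
        ViTImageProcessor,
    )

    # 1. Load the Beans dataset (train / validation / test splits).
    dataset = load_dataset("beans")
    label_names = dataset["train"].features["labels"].names
    id2label, label2id = build_label_maps(label_names)

    # 2. Preprocess lazily: resize and normalize images into pixel tensors.
    processor = ViTImageProcessor.from_pretrained(CHECKPOINT)

    def transform(batch):
        images = [image.convert("RGB") for image in batch["image"]]
        inputs = processor(images, return_tensors="pt")
        return {"pixel_values": inputs["pixel_values"], "labels": batch["labels"]}

    prepared = dataset.with_transform(transform)

    def collate_fn(examples):
        # Stack per-example tensors into one batch for the model.
        return {
            "pixel_values": torch.stack([ex["pixel_values"] for ex in examples]),
            "labels": torch.tensor([ex["labels"] for ex in examples]),
        }

    # 3. Load the pretrained backbone with a fresh classification head.
    model = ViTForImageClassification.from_pretrained(
        CHECKPOINT,
        num_labels=len(label_names),
        id2label=id2label,
        label2id=label2id,
    )

    # 4. Configure and run training (hyperparameters are illustrative guesses).
    args = TrainingArguments(
        output_dir=OUTPUT_DIR,
        per_device_train_batch_size=16,
        num_train_epochs=4,
        learning_rate=2e-4,
        remove_unused_columns=False,  # keep the "image" column for the transform
    )
    trainer = Trainer(
        model=model,
        args=args,
        data_collator=collate_fn,
        train_dataset=prepared["train"],
        eval_dataset=prepared["validation"],
    )
    trainer.train()

    # 5. Optionally upload the result (requires a prior Hugging Face Hub login).
    if push_to_hub:
        trainer.push_to_hub()
    return trainer
```

Calling `fine_tune_vit_on_beans()` downloads the dataset and checkpoint and starts training. Nothing runs on import, so the label-mapping helper can be tried on its own first.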


Summaries are AI-generated; the original article is authoritative.