Hugging Face Blog, Mar 18, 2024, 12:00 AM

Hugging Face Launches Quanto: A New PyTorch Quantization Backend for Optimum

Original: Quanto: a PyTorch quantization backend for Optimum


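Hugging Face has released Quanto, a new open-source PyTorch quantization library that is now integrated into the Optimum ecosystem. Quanto supports quantization of both weights and activations (including int4, int8, and float8) and is cross-platform: it runs on CPU, GPU, and Apple Silicon (MPS). With just a few lines of code, developers can apply post-training quantization (PTQ) or quantization-aware training (QAT) to Transformers and Diffusers models.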
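To make "int8 weight quantization" concrete, here is a minimal sketch in plain Python of symmetric per-tensor quantization: each float is mapped to an integer code in [-127, 127] plus one shared scale. This is only an illustration of the general idea, not Quanto's implementation or API; the function names `quantize_int8` and `dequantize_int8` and the sample weights are hypothetical.

```python
def quantize_int8(values):
    """Symmetric per-tensor quantization: floats -> int8 codes plus one scale.

    Illustrative only; this is not how Quanto is implemented.
    """
    max_abs = max((abs(v) for v in values), default=0.0)
    # Map the largest magnitude to 127; fall back to 1.0 for all-zero input.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    # Round to the nearest integer code and clamp to the symmetric int8 range.
    codes = [max(-127, min(127, round(v / scale))) for v in values]
    return codes, scale


def dequantize_int8(codes, scale):
    """Recover approximate floats from integer codes and the shared scale."""
    return [c * scale for c in codes]


# Hypothetical sample weights, chosen only for illustration.
weights = [0.5, -1.27, 0.02, 1.0]
codes, scale = quantize_int8(weights)
restored = dequantize_int8(codes, scale)

for w, c, r in zip(weights, codes, restored):
    print(f"{w:+.4f} -> code {c:+4d} -> {r:+.4f}")
```

Each restored value differs from the original by at most half a quantization step (`scale / 2`), which is the accuracy cost paid for storing one byte per weight instead of four.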

Hugging Face has introduced Quanto, a new quantization library for PyTorch that is integrated as a backend into the Hugging Face Optimum ecosystem. As large language models (LLMs) and diffusion models keep growing, running them on resource-constrained consumer hardware has become a critical challenge. Quanto was created to address this problem.



Summaries are AI-generated; the original article is authoritative.