Hugging Face Introduces Quanto: A New PyTorch Quantization Backend for Optimum
Original: Quanto: a PyTorch quantization backend for Optimum
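Hugging Face has released Quanto, a new open-source PyTorch quantization library that is now integrated into the Optimum ecosystem. Quanto supports quantization of both weights and activations (including int4, int8, and float8) and is cross-platform, running on CPU, GPU, and Apple Silicon (MPS). With only a few lines of code, developers can apply post-training quantization (PTQ) or quantization-aware training (QAT) to Transformers and Diffusers models.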
Hugging Face has officially introduced Quanto, a new quantization library for PyTorch that has been integrated as a backend into the Hugging Face Optimum ecosystem. As large language models (LLMs) and diffusion models keep growing, running them on resource-constrained consumer hardware has become a critical challenge. Quanto was created to address this pain point.
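To make the idea of int8 weight quantization concrete, here is a minimal sketch in plain Python. It is a generic illustration of symmetric per-tensor quantization, not Quanto's implementation or API; the helper names `quantize_int8` and `dequantize` are hypothetical and chosen only for this example.

```python
def quantize_int8(weights):
    """Map floats to integer codes in [-127, 127] using one shared scale.

    Generic symmetric per-tensor scheme for illustration only;
    this is an assumption, not how Quanto is implemented.
    """
    max_abs = max(abs(w) for w in weights)
    # Guard against an all-zero tensor, where any scale would do.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    codes = [max(-127, min(127, round(w / scale))) for w in weights]
    return codes, scale


def dequantize(codes, scale):
    """Recover approximate float values from the integer codes."""
    return [c * scale for c in codes]


weights = [0.5, -1.27, 0.03, 1.0]
codes, scale = quantize_int8(weights)
restored = dequantize(codes, scale)

for w, c, r in zip(weights, codes, restored):
    print(f"{w:+.4f} -> code {c:+4d} -> {r:+.4f}")
```

Each weight is stored as a one-byte code plus a single shared scale, which is where the memory savings over 32-bit floats come from. The cost is a rounding error of at most half the scale per weight.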