Hugging Face Blog · Jun 22, 2022, 12:00 AM

Converting Transformers Models to ONNX Format with Hugging Face Optimum

Original: Convert Transformers to ONNX with Hugging Face Optimum


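This article introduces Optimum, Hugging Face's one-stop hardware optimization toolkit, and shows how to convert Transformers models to ONNX format. Using either the simple optimum-cli command-line tool or the Python API, developers can complete the conversion and then use ONNX Runtime for significant inference speedups and quantization on a variety of hardware, removing the pain of manual conversion, which used to be tedious and error-prone.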

When deploying Transformer models in production, latency and throughput are typically the key factors determining the quality of the user experience. ONNX (Open Neural Network Exchange) is an open model format; paired with ONNX Runtime, it lets models run with optimized inference performance across different hardware architectures (such as CPUs and GPUs). However, manually converting PyTorch or TensorFlow Transformer models to ONNX has traditionally been tedious and prone to failures caused by unsupported operators or misconfiguration.
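
To make the two conversion paths named in the summary concrete, here is a minimal sketch of what they might look like. It assumes the optimum package is installed with its ONNX Runtime extras. The model ID used in the usage note and the helper names build_export_command and export_with_python_api are illustrative assumptions, not taken from the original article, and a sequence-classification model is assumed for the Python path.

```python
from __future__ import annotations

from pathlib import Path


def build_export_command(model_id: str, output_dir: str) -> list[str]:
    """Build the argv for the `optimum-cli export onnx` command.

    Hypothetical helper: it only assembles the command, it does not run it.
    """
    return ["optimum-cli", "export", "onnx", "--model", model_id, output_dir]


def export_with_python_api(model_id: str, output_dir: str) -> Path:
    """Export a sequence-classification checkpoint to ONNX via the Python API.

    Hypothetical helper. Imports are deferred so this module can be loaded
    even when optimum is not installed.
    """
    from optimum.onnxruntime import ORTModelForSequenceClassification
    from transformers import AutoTokenizer

    # export=True converts the checkpoint to ONNX while loading it
    # (older Optimum releases used from_transformers=True instead).
    model = ORTModelForSequenceClassification.from_pretrained(model_id, export=True)
    tokenizer = AutoTokenizer.from_pretrained(model_id)

    # Save the ONNX model and tokenizer files side by side.
    model.save_pretrained(output_dir)
    tokenizer.save_pretrained(output_dir)
    return Path(output_dir)
```

For example, build_export_command("distilbert-base-uncased-finetuned-sst-2-english", "distilbert_onnx/") yields the arguments for a command-line export that could be passed to subprocess.run, while export_with_python_api performs the equivalent conversion in-process; either path downloads the checkpoint from the Hugging Face Hub on first use.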


Summaries are AI-generated; the original article is authoritative.