Hugging Face Releases TRL v1.0: An Open-Source Library Built for Post-Training, Marking a New Milestone in API Stability and Efficient Alignment
Original: TRL v1.0: Post-Training Library Built to Move with the Field
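TRL, Hugging Face's popular Transformer reinforcement learning library, has officially reached v1.0. This release establishes a stable API design and focuses the library's positioning on the post-training ecosystem. TRL v1.0 integrates mainstream alignment techniques such as supervised fine-tuning (SFT), Direct Preference Optimization (DPO), and Group Relative Policy Optimization (GRPO), which surged in popularity with DeepSeek. The goal is to give developers a standardized tool that can keep pace with the fast-changing AI field.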
Hugging Face has officially announced the release of TRL (Transformer Reinforcement Learning) v1.0. This is a major milestone, marking TRL's transformation from an experimental research tool into a production-ready, API-stable core open-source library for post-training.
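For readers unfamiliar with GRPO, the "group relative" part can be illustrated with a minimal sketch: several completions are sampled for the same prompt, and each completion's reward is normalized against the mean and standard deviation of its own group. The snippet below is plain Python and does not use TRL's API; the function name `group_relative_advantages`, the `eps` constant, and the use of the population standard deviation are illustrative assumptions, not details taken from the announcement.

```python
from statistics import mean, pstdev

def group_relative_advantages(rewards, eps=1e-4):
    """Normalize rewards within one group of completions for the same prompt.

    Hypothetical helper for illustration only; not part of TRL.
    """
    mu = mean(rewards)        # group baseline
    sigma = pstdev(rewards)   # population std (an assumption of this sketch)
    # eps avoids division by zero when every completion gets the same reward
    return [(r - mu) / (sigma + eps) for r in rewards]

# Four completions for one prompt, scored by some reward function.
rewards = [1.0, 0.0, 0.0, 1.0]
advantages = group_relative_advantages(rewards)
```

Completions that score above their group's average receive a positive advantage and those below receive a negative one, so no separate learned value model is needed to provide a baseline.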
Read the original article on the Hugging Face Blog. Summaries are AI-generated; the original article is authoritative.