Latest in AI

Showing: trl, Researchers

Shipping a Trillion Parameters With a Hub Bucket: Delta Weight Sync in TRL
Hugging Face Blog · 18 days ago · Tutorial
Based on its title, this Hugging Face Blog post covers Delta Weight Sync in TRL. It likely discusses synchronizing weight differences for very large models through a workflow built on a Hub bucket. Without the full article, implementation details, benchmarks, APIs, and stability claims cannot be confirmed.
20x Faster Hugging Face TRL Fine-Tuning With RapidFire AI ★ 80
Hugging Face Blog · 205 days ago · Release
The Hugging Face official blog has announced a collaboration with RapidFire AI, bringing a revolutionary performance improvement to its popular TRL…
No GPU Left Idle: Unlocking High-Efficiency Reinforcement Learning Training With Co-located vLLM in TRL ★ 85
Hugging Face Blog · 376 days ago · Release
In the reinforcement learning from human feedback (RLHF) training process for large language models — whether PPO or the recently popular GRPO — there are…
Hugging Face Introduces the RLOO Algorithm: Lower Memory Consumption, Bringing Reinforcement Learning Back Into the RLHF Mainstream ★ 80
Hugging Face Blog · 732 days ago · Release
In recent years, methods such as Direct Preference Optimization (DPO) have become mainstream for large language model (LLM) alignment, as they eliminate the…
Preference Tuning LLMs With Direct Preference Optimization (DPO) Methods ★ 80
Hugging Face Blog · 878 days ago · Tutorial
This technical blog post from Hugging Face takes an in-depth look at the latest techniques in "preference tuning," with a particular focus on Direct…