Latest in AI

Showing:dpoClear ×

🔥 Trending today

anthropic6 export-controls4 model-access3 amazon3 national-security2 open-source2 ai-regulation2 government-policy2 enterprise-ai2 compliance2

Topic

Release New Tool Tutorial Business Paper Benchmark Opinion Regulation

For

General Developers Designers Product Founders Marketing Researchers Students

Direct Preference Optimization Beyond Chatbots
Hugging Face Blog11 days agoTutorial
Based only on the title, this Hugging Face Blog post appears to discuss Direct Preference Optimization outside conventional chatbot use cases. It may frame DPO as a broader preference-alignment method for model outputs, workflows, or non-conversational AI systems. Without the full article, specific claims about experiments, datasets, models, or implementation details cannot be verified.
Hugging Face 發表 TRL v1.0：專為後訓練（Post-Training）打造的開源庫，邁向 API 穩定與高效對齊新里程碑★ 85
Hugging Face Blog75 days agoRelease
Hugging Face has officially announced the release of TRL (Transformer Reinforcement Learning) v1.0. This is a major milestone, marking TRL's transformation…
使用 RapidFire AI 讓 Hugging Face TRL 微調速度提升 20 倍★ 80
Hugging Face Blog205 days agoRelease
The Hugging Face official blog has announced a collaboration with RapidFire AI, bringing a revolutionary performance improvement to its popular TRL…
Hugging Face TRL 支援視覺語言模型 (VLM) 對齊：輕鬆實現多模態 DPO 與 ORPO 訓練★ 80
Hugging Face Blog311 days agoRelease
Hugging Face's TRL (Transformer Reinforcement Learning) is a popular open-source library specifically designed for aligning language models (LLMs). In its…
Hugging Face 社群推出用於文字生成圖像的開源偏好資料集 (Open Preference Dataset)★ 75
Hugging Face Blog552 days agoRelease
### Introduction: An Important Piece of the Open-Source Image Generation Puzzle As text-to-image (T2I) technology advances rapidly, ensuring that AI-generated…
視覺語言模型（VLM）的偏好最佳化指南：使用 TRL 進行 DPO 微調★ 75
Hugging Face Blog704 days agoTutorial
As vision-language models (VLMs) are increasingly applied to multimodal tasks, how to make these models produce outputs that better align with human…
使用開源 LLM 實作憲政 AI (Constitutional AI)：Hugging Face 的對齊新指南★ 78
Hugging Face Blog864 days agoTutorial
This blog post from Hugging Face provides an in-depth exploration of how to implement "Constitutional AI (CAI)" using open-source large language models (Open…
使用直接偏好最佳化 (DPO) 方法對 LLM 進行偏好微調 (Preference Tuning)★ 80
Hugging Face Blog878 days agoTutorial
This technical blog post from Hugging Face takes an in-depth look at the latest techniques in "preference tuning," with a particular focus on **Direct…
2023 年：開源大語言模型（Open LLMs）爆發之年★ 75
Hugging Face Blog909 days agoCommentary
Looking back on 2023, the most notable trend in the AI landscape was the explosive growth of open-source large language models (Open LLMs). In this annual…
使用 DPO 微調 Llama 2：Hugging Face TRL 實作指南★ 80
Hugging Face Blog1,041 days agoTutorial
### Background and Pain Points Traditional RLHF (Reinforcement Learning from Human Feedback), while achieving enormous success with models like ChatGPT…