With the success of reasoning models such as DeepSeek-R1, reinforcement learning (RL/RLHF) has become a critical technique for improving the alignment and…
As large language models (LLMs) push the demand for long context toward the million-token scale, the VRAM of a single GPU can no longer accommodate the…
This article, published on the Hugging Face blog and authored by the LinkedIn team, is a practical retrospective whose core subject is how to unlock "Agentic…
Hugging Face's official blog recently published a major update announcing a comprehensive overhaul of the streaming mode in its core open-source library…
As "Sovereign AI" becomes a global trend, countries around the world are actively seeking to build AI models that reflect their own culture, values, and…
As the parameter counts of generative AI and large language models (LLMs) push into the tens and hundreds of billions, the memory of a single GPU has long been…
### Background and the Mystery of the "Aha Moment"

Following the release of DeepSeek-R1, a wave of excitement around "Reasoning Models" swept the AI community…
### The Mathematical Flaw in Traditional Gradient Accumulation

Gradient accumulation is an extremely common technique in deep learning. When VRAM is limited…
When fine-tuning or pre-training large language models (LLMs), the sequence lengths of input data are typically uneven. The traditional approach is to use…
As the parameter counts of large language models (LLMs) have skyrocketed, the hardware requirements for training and fine-tuning these models have risen…
This technical blog post from Hugging Face takes an in-depth look at the critical "implementation details" that are routinely glossed over in academic papers…
This case study takes an in-depth look at how Writer, an enterprise-grade generative AI platform, leverages the Hugging Face open-source ecosystem and…
This technical blog post from Hugging Face takes an in-depth look at the challenges the BigCode project (the collaborative initiative behind StarCoder) faced…
This case study introduces a deep technical collaboration between Databricks and Hugging Face, aimed at addressing the efficiency and cost challenges…
The release of ChatGPT in late 2022 triggered an explosion in generative AI, and the most critical technology behind it is Reinforcement Learning from Human…
As language models continue to grow in scale, the memory (VRAM) of a single GPU has long been unable to accommodate models with tens or hundreds of billions of…
This official Hugging Face blog post provides a detailed walkthrough of how to combine the `Accelerate` library with Microsoft's `DeepSpeed` deep learning…
As the parameter scale of Transformer models (such as GPT and T5) grows exponentially, deep learning faces a severe "Memory Wall" challenge. With limited…