In the field of machine learning, "knowledge distillation" is a well-established technique that generally refers to using the output data generated by a…
Anthropic recently published research on "distillation attacks," defining the practice of external developers using its API outputs to train other models as a…
This article by Nathan Lambert takes a deep dive into the tangled competitive dynamics between open-source and closed-source AI models. Lambert argues that…
With the successive emergence of models with powerful "reasoning" capabilities — such as OpenAI o1, o3, and DeepSeek-R1 — the challenge of reducing the…
### Project Background: Recreating the Open-Source Miracle of DeepSeek-R1

The emergence of DeepSeek-R1 sent shockwaves through the global AI community…
With the explosion of foundation models and large language models (LLMs), enterprises are eager to incorporate these powerful technologies into real-world…
In this technical blog post, the Hugging Face team explains in detail how they achieved up to a 100x inference speedup for Transformer models for customers of…