Replicate has published its technical newsletter, Replicate Intelligence #4, summarizing recent major developments in the AI field as well as the latest…
Hugging Face has announced the official launch of the "Open Medical-LLM Leaderboard" in collaboration with researchers from Open Life Science AI and the…
As large language models for code (Code LLMs) develop rapidly, evaluating their capabilities fairly and accurately has become a major challenge. Traditional…
Hugging Face has announced the launch of a new multimodal benchmark and leaderboard called "ConTextual," aimed at addressing the shortcomings of existing…
### Background: The Shortcomings of Static Safety Evaluations

As large language models (LLMs) are widely adopted across industries, AI safety has become an…
Hugging Face has announced the launch of the new **NPHardEval** leaderboard — a benchmark specifically designed to evaluate the reasoning capabilities of large…
Hugging Face has partnered with Patronus AI — a startup focused on LLM evaluation and defense — to officially launch the **Enterprise Scenarios Leaderboard**…
While large language models (LLMs) have demonstrated remarkable generative capabilities across many domains, "hallucination" — where a model confidently…
### Introduction: Capability Is Not Safety - A New Benchmark for LLM Safety Evaluation

As large language models (LLMs) are adopted more deeply across…
In the development of large language models (LLMs), RLHF (Reinforcement Learning from Human Feedback) is the critical step for aligning models with human…
In the machine learning field, deploying research-stage models to production environments — such as packaging them into Docker containers or deploying them to…
This article explains how to accelerate the deployment and inference of Hugging Face Transformers models using AWS Inferentia2 (Inf2 instances) — AWS's…
Amid the generative AI wave sparked by ChatGPT, Hugging Face published this in-depth article exploring how to transform "base language models" — which can only…
The release of ChatGPT in late 2022 triggered an explosion in generative AI, and the most critical technology behind it is Reinforcement Learning from Human…
In December 2022, Elixir language creator José Valim and Hugging Face jointly announced a transformative project for the Elixir community: Bumblebee. The…
In the field of natural language generation (NLG), enabling language models to produce coherent and natural long-form text has long been a major challenge…
As language models continue to grow, the memory (VRAM) of a single GPU can no longer accommodate models with tens or hundreds of billions of…
This classic Hugging Face blog post documents the birth of the "CodeParrot" project — an experiment in training a code generation model entirely from scratch…
In late 2021, the AI field witnessed unprecedented, explosive growth in large language models (LLMs). From OpenAI's GPT-3 at 175 billion parameters to the…
In the field of natural language processing (NLP), sequence-to-sequence (Seq2Seq) models — such as those used for translation or summarization — typically…