In the history of artificial intelligence, the ImageNet dataset (released in 2009, and the stage for AlexNet's 2012 breakthrough) is widely recognized as the key catalyst that ignited the deep…
### Background

With the proliferation of vision-language models (VLMs), using VLMs for document OCR (e.g., converting PDFs to Markdown) has become mainstream…
Hugging Face's official blog announced that Cohere, the well-known enterprise AI research and development company, has officially joined Hugging Face's…
The Language Technologies department (BSC-LT) of the Barcelona Supercomputing Center (BSC) recently released a new open-source multimodal model on Hugging Face…
At NVIDIA GTC 2025, NVIDIA unveiled a remarkable set of new open-source models and datasets for the field of "Physical AI" — also known as embodied…
Since its launch, Hugging Face's Open R1 project has been dedicated to replicating the reasoning capabilities of DeepSeek-R1 in a fully open-source manner. In…
Alibaba's open-source Wan2.1 is a text-to-video model that has been attracting considerable attention. To help developers and creators get the most out of this…
Cohere For AI (C4AI) has officially launched "Aya Vision," a series of open-source multimodal models (available in 8B and 32B parameter versions) designed…
In the current era of generative AI sweeping the globe, many developers habitually feed all tasks — including simple text classification, sentiment analysis…
### Background and Pain Points

As large language models (LLMs) have become widespread, the file sizes hosted on the Hugging Face Hub have grown dramatically…
Hugging Face has officially published the second technical update (Update #2) for the Open R1 project, which aims to replicate DeepSeek-R1's reasoning model…
Physical Intelligence, a Physical AI startup founded by robotics luminary Sergey Levine and others, has officially open-sourced its flagship robot foundation…
### Background and the Goals of the Open-R1 Project

Since the release of DeepSeek-R1, its powerful reasoning capability and remarkably low training cost have…
As DeepSeek-R1 swept through the AI landscape on the strength of its powerful reasoning capabilities, how to safely and efficiently deploy and fine-tune these…
This official Hugging Face blog post takes an in-depth look at the current state of open-source video generation models within the Diffusers ecosystem. As…
### What Are Static Embeddings?

In today's NLP landscape, Transformer-based embedding models (such as BERT and mE5) have become mainstream, as they…
Hugging Face has recently released a new Visual Document Retrieval (VDR) model — **VDR-2B-multilingual**. This technology marks a formal transition in document…
Despite the recent dominance of generative decoder models (such as GPT and Llama), encoder-only models (such as BERT) remain indispensable behind the scenes…
In the history of AI development, the open-sourcing of Stable Diffusion in 2022 is regarded as a pivotal turning point in the field of image generation — it…
Hugging Face officially launched an open-source initiative called "LeMaterial" — a major project aimed at using artificial intelligence to accelerate materials…
The Hugging Face Hub currently hosts millions of AI models, datasets, and applications (Spaces), with total storage reaching the hundreds of petabytes. As the…
On October 23, 2024, Google and Hugging Face jointly announced the open-sourcing of Google's "SynthID Text" technology and its integration into Hugging Face's…
Stability AI officially launched the Stable Diffusion 3.5 (SD3.5) model series in late October 2024, and Hugging Face's Diffusers team simultaneously announced…
The well-known AI model hosting platform Replicate has announced a significant speed improvement for FLUX image generation models running on its platform. FLUX…
This article from the Hugging Face blog takes an in-depth look at how China's AI sector has successfully gone global in recent years…
In the field of 3D generative AI (encompassing models such as InstantMesh and Tripo3D), generated 3D models typically represent color using "vertex coloring."…
Hugging Face has officially announced a partnership with the well-known cybersecurity company Truffle Security, integrating the open-source credential scanning…
### Background and Challenges

In robotics (for example, embodied AI), imitation learning and reinforcement learning require collecting large volumes of robot…
Replicate Intelligence #11 compiles major recent technical breakthroughs and application trends in the generative AI space, focusing primarily…
This Hugging Face blog post provides a detailed account of the team's attempt to reproduce and evaluate Google's proposed "Infini-Attention" mechanism — and…