The official Hugging Face blog has announced the launch of "Storage Buckets" on the Hugging Face Hub. This is a cloud object storage service designed…
Hugging Face's official blog recently announced a comprehensive overhaul of the streaming mode in its core open-source library…
Hugging Face has officially released `LeRobotDataset:v3.0`, a critical technical upgrade to its open-source robot learning library `lerobot`, with the core…
Hugging Face has officially launched a new tool called "AI Sheets," an intuitive spreadsheet tool designed specifically for dataset processing. It aims to make…
### What Is Parquet Content-Defined Chunking (CDC)? In the AI and machine learning field, dataset sizes are growing at a staggering pace. Datasets on the…
With the rapid development of vision-language models (VLMs) and multimodal AI, the amount of data required to train these models has grown explosively…
### Background and Pain Points: The Limitations of Git LFS Hugging Face Hub, as the world's largest AI model and dataset hosting platform, has long relied on…
Hugging Face has announced a formal partnership with India's premier academic institution, the Indian Institute of Science (IISc), with the core goal of…
Hugging Face officially launched an open-source initiative called "LeMaterial," a major project aimed at using artificial intelligence to accelerate materials…
Hugging Face's official blog has published an article inviting machine learning (ML) researchers and developers worldwide to share their…
As the scale of AI models and the volume of training data grow dramatically, the computational capacity and memory (RAM) of a single machine often become…
The Hugging Face Hub, as the world's largest open-source AI community and dataset hosting platform, automatically converts datasets uploaded in various formats…
Hugging Face has officially announced the launch of a brand-new "SQL Console" feature on the Datasets pages of the Hugging Face Hub. This feature is designed…
### Background and Challenges In robotics (for example, embodied AI), imitation learning and reinforcement learning require collecting large volumes of robot…
Hugging Face, as the world's largest open-source AI community, has developed many powerful tools beyond its well-known Model Hub that often go unnoticed by…
As open-source AI has flourished, Hugging Face Hub has become the world's largest hosting platform for machine learning models and datasets. However…
In July 2024, Hugging Face's official blog announced the launch of new "Dataset Search and Filtering Features," aimed at addressing the pain point of precisely…
Hugging Face recently published its "Ethics and Society Newsletter #6," with this issue focused on the theme "Building Better AI: The Importance of Data…
### Background and Challenge: The High-Quality Data Bottleneck In the current development of generative AI and large language models (LLMs), the industry…
Hugging Face has published a practical guide specifically for the GLAM sector (Galleries, Libraries, Archives, and Museums), exploring how cultural heritage…
The official Hugging Face blog has announced an important integration: developers can now use DuckDB to directly analyze and query more than 50,000 open…
With the rapid growth of voice AI (such as Whisper), efficiently handling audio datasets has become critically important. This guide from the official Hugging…
Hugging Face officially announced the introduction of a DOI (Digital Object Identifier) mechanism for models and datasets hosted on the Hugging Face Hub. This…
Hugging Face announced new official Audio and Vision documentation guides for its core open-source library `datasets`. As multimodal AI models continue to…
Hugging Face's official blog has announced a major upgrade to a core feature of its platform: the search functionality of the Hugging Face Hub. As the open-source AI…
In late 2021, Hugging Face launched the "Data Measurements Tool," an open-source, interactive utility designed to address the problem of "dataset black boxes"…