Hugging Face officially published Transformers.js v4 on NPM, marking a major milestone for running local AI models within the JavaScript ecosystem…
During Microsoft Build 2024, Hugging Face announced a further strategic collaboration with Microsoft, aimed at providing developers with a more seamless…
SD Turbo and SDXL Turbo are single-step or few-step text-to-image models from Stability AI whose core innovation is Adversarial Diffusion Distillation…
Hugging Face officially announced a deep collaboration with Microsoft to integrate ONNX Runtime (ORT) into the Hugging Face ecosystem. This partnership enables…
This official Hugging Face blog post explores in depth how to use the Transformers.js library to run machine learning (ML) models directly in the browser…
As the scale of deep learning models (such as Transformers) continues to grow, training these models demands enormous computational resources and time. To help…
"Document AI" is a key driver of enterprise digital transformation in recent years, aimed at automating the processing of unstructured documents such as…
When deploying Transformer models in production, latency and throughput are typically the key factors determining the quality of the user experience. ONNX…
When deploying Transformer models in production, reducing inference latency and increasing throughput while keeping computational costs under control has…
This case study focuses on the performance of "Hugging Face Infinity", Hugging Face's high-performance inference container solution, on modern CPUs…
In many real-world enterprise production environments, although GPUs offer extremely high throughput for deep learning inference, CPUs remain indispensable due…
In this technical blog post, the Hugging Face team explains in detail how they achieved an inference speedup of up to 100x for Transformer models for customers of…