As large language models (LLMs) have rapidly advanced, traditional static benchmarks (such as MMLU) have increasingly faced saturation and gaming problems. As…
This case study details how the non-profit organization Digital Green, with support from Hugging Face's Expert Support team, optimized its…
Hugging Face has officially launched HUGS (Hugging Face Generative AI Services), a new microservices solution designed to address the pain points enterprises face…
Meta's Llama 3.2 release includes lightweight 1B and 3B text models designed specifically for edge computing and mobile devices. These models have now been…
As generative AI applications become more widespread, one of the biggest challenges developers face is the "non-deterministic" output of large language models…
AMD has officially launched its 5th-generation EPYC processor, codenamed "Turin," and Hugging Face has promptly published a blog post detailing the deep…
Meta has officially introduced the Llama 3.2 family of open-source models, marking a significant architectural upgrade with two major breakthroughs: multimodal…
The deployment of large language models (LLMs) has long faced a dual bottleneck of VRAM capacity and memory bandwidth. Microsoft previously introduced the…
Hugging Face has officially introduced the "Community Tools" feature to its open-source chat platform, HuggingChat. This major update injects powerful Agent…
Meta's Llama 3.1 405B is one of the most powerful open-source large language models available today, but its massive parameter count (405 billion) poses…
GGML is a lightweight, zero-dependency C/C++ tensor library developed by Georgi Gerganov. It was originally designed to enable efficient local inference of the…
In AI agent development, "tool use" (also known as function calling) is the core capability that allows large language models…
Hugging Face and NVIDIA announced a major partnership in late July 2024, officially launching a serverless inference service powered by NVIDIA NIM (NVIDIA…
Replicate has published the eighth issue of its technical digest (Replicate Intelligence #8), bringing three major updates for developers: 1. **Top…
Meta's Llama 3.1 represents a major milestone in the open-source AI landscape. The most notable model is the 405B (405 billion parameter) version — the first…
On July 23, 2024, Meta officially released the highly anticipated Llama 3.1 405B — one of the most powerful open-source large language models in the world…
The Hugging Face blog has introduced a major update to its open-source inference engine, Text Generation Inference (TGI): the…
In the AI field, quickly building a chatbot that can accurately answer questions about a specific domain or newly released software has always been a major…
Hugging Face officially announced a deep integration with KerasHub — the new unified library for natural language processing (NLP) and computer vision (CV) in…
The Hugging Face team published a blog post announcing that their Code Agent, developed using the `transformers` library, achieved a breakthrough score on the…
Issue #3 of Replicate Intelligence brings curated content on three core themes for developers and AI enthusiasts: 1. **Garden State Llama**: This is a…
Hugging Face has announced the launch of "NPC-Playground," a 3D interactive sandbox environment designed to showcase and test non-player characters (NPCs)…
This Replicate technical digest (Intelligence #1) compiles three of the most talked-about technical breakthroughs and open-source projects in the AI community…
As large language models (LLMs) become increasingly prevalent in software development and automated workflows, their "dual-use" risks in the cybersecurity…
Hugging Face has announced official support for AWS Inferentia2 (Inf2) instances within its hosted Inference Endpoints service. This update gives developers…
Hugging Face and Dell Technologies have announced the launch of the "Dell Enterprise Hub," a new solution designed for enterprise on-premises AI deployment. As…
During Microsoft Build 2024, Hugging Face announced a further strategic collaboration with Microsoft, aimed at providing developers with a more seamless…
With the explosive growth of generative AI, demand for high-performance GPUs has reached an unprecedented level. To break hardware monopolies and reduce AI…
As enterprise demand for Retrieval-Augmented Generation (RAG) technology surges, how to maintain high performance while controlling hardware costs has become…
Hugging Face has announced a partnership with the independent AI performance analytics firm Artificial Analysis, officially integrating its "LLM Performance…