Meta's Llama 3.1 405B is one of the most powerful open-source large language models available today, but its massive parameter count (405 billion) poses…
GGML is a lightweight, zero-dependency C/C++ tensor library developed by Georgi Gerganov. It was originally designed to enable efficient local inference of the…
Background and Pain Points: In AI agent development, "tool use" (also known as function calling) is the core capability that allows large language models…
As generative AI develops rapidly, many enterprises are trying to move AI from the "proof of concept (PoC)" stage into actual production environments. Vercel…
Hugging Face and NVIDIA announced a major partnership in late July 2024, officially launching a serverless inference service powered by NVIDIA NIM (NVIDIA…
Replicate has published the eighth issue of its technical digest (Replicate Intelligence #8), bringing three major updates for developers: 1. **Top…
Meta's Llama 3.1 represents a major milestone in the open-source AI landscape. The most notable model is the 405B (405 billion parameter) version — the first…
On July 23, 2024, Meta officially released the highly anticipated Llama 3.1 405B — one of the most powerful open-source large language models in the world…
The Hugging Face official blog has introduced a major update to its open-source text generation inference engine, Text Generation Inference (TGI): the…
In the AI field, quickly building a chatbot that can accurately answer questions about a specific domain or newly released software has always been a major…
Hugging Face officially announced a deep integration with KerasHub — the new unified library for natural language processing (NLP) and computer vision (CV) in…
The Hugging Face team published a blog post announcing that their Code Agent, developed using the `transformers` library, achieved a breakthrough score on the…
Replicate Intelligence #3 brings curated content on three core themes for developers and AI enthusiasts: 1. **Garden State Llama**: This is a…
Hugging Face has announced the launch of "NPC-Playground," a 3D interactive sandbox environment designed to showcase and test non-player characters (NPCs)…
This Replicate technical digest (Intelligence #1) compiles three of the most talked-about technical breakthroughs and open-source projects in the AI community…
As large language models (LLMs) become increasingly prevalent in software development and automated workflows, their "dual-use" risks in the cybersecurity…
Hugging Face has announced official support for AWS Inferentia2 (Inf2) instances within its hosted Inference Endpoints service. This update gives developers…
During Microsoft Build 2024, Hugging Face announced a further strategic collaboration with Microsoft, aimed at providing developers with a more seamless…
With the explosive growth of generative AI, demand for high-performance GPUs has reached an unprecedented level. To break hardware monopolies and reduce AI…
Hugging Face and Dell Technologies have announced the launch of the "Dell Enterprise Hub," a new solution designed for enterprise on-premises AI deployment. As…
As enterprise demand for Retrieval-Augmented Generation (RAG) technology surges, how to maintain high performance while controlling hardware costs has become…
Hugging Face has announced a partnership with the independent AI performance analytics firm Artificial Analysis, officially integrating its "LLM Performance…
When developing applications based on large language models (LLMs) — such as AI agents, RAG systems, or automated workflows — one of the biggest challenges…
Hugging Face has announced the official launch of the "Open Medical-LLM Leaderboard" in collaboration with researchers from Open Life Science AI and the…
Meta officially released the highly anticipated open-source large language model Llama 3 on April 18, 2024, in two sizes: 8B (8 billion parameters) and 70B (70…
Meta officially released Llama 3, the next generation of its open-source large language models, on April 18, 2024. The initial release includes two parameter…
This case study details how biomedical AI startup Ryght leveraged Hugging Face's Expert Support service to overcome the many challenges of deploying generative…
As code large language models (Code LLMs) develop rapidly, fairly and accurately evaluating their capabilities has become a major challenge. Traditional…
This tutorial article details how to build an efficient natural language to SQL (Text2SQL) query system using tools from the Hugging Face ecosystem and a…
Hugging Face and internet infrastructure giant Cloudflare have announced a major partnership that officially brings serverless GPU inference services to…