The slow autoregressive generation speed of large language models (LLMs) has long been a major bottleneck in real-world deployment. While "speculative…
In the deployment and inference of large language models (LLMs), reducing generation latency has always been a critical challenge. The traditional approach of…
Cohere For AI (C4AI) has officially launched Aya Expanse, a family of open-weight models designed specifically for multilingual tasks. The family includes two…
On October 23, 2024, Google and Hugging Face jointly announced the open-sourcing of Google's "SynthID Text" technology and its integration into Hugging Face's…
This article from the Hugging Face blog takes an in-depth look at how China's AI sector has successfully expanded globally in recent years…
The Technology Innovation Institute (TII) of Abu Dhabi has officially released Falcon Mamba 7B, a significant milestone in the evolution of AI architectures…
Replicate has published its eighth issue of technical intelligence (Replicate Intelligence #8), bringing three major updates for developers: 1. **Top…
Meta's Llama 3.1 represents a major milestone in the open-source AI landscape. The most notable model is the 405B (405 billion parameter) version — the first…
On July 23, 2024, Meta officially released the highly anticipated Llama 3.1 405B — one of the most powerful open-source large language models in the world…
Google has officially launched the next generation of its open-source large language model, Gemma 2, with an initial release in two sizes — 9B (9 billion…
The Technology Innovation Institute (TII) of Abu Dhabi has officially released a new open-source model family on Hugging Face — Falcon 2 11B. This model, with…
Snowflake recently launched a brand-new open-source large language model called "Snowflake Arctic" — a Mixture of Experts (MoE) model designed for…
Meta officially released Llama 3, the next generation of its open-source large language models, on April 18, 2024. The initial release includes two parameter…
Google and Hugging Face have jointly announced the launch of CodeGemma, a family of lightweight open-source large language models (LLMs) designed specifically…
Hugging Face has officially released Cosmopedia, currently the largest fully open-source synthetic dataset designed for the pre-training of large language…
Hugging Face has officially introduced Quanto, a brand-new quantization library designed for PyTorch, which has been integrated as a backend into the Hugging…
Hugging Face has announced a deep partnership with NVIDIA to directly integrate NVIDIA DGX Cloud services into the Hugging Face platform. This collaboration…
The BigCode community, jointly led by Hugging Face and ServiceNow, together with NVIDIA, has officially announced the launch of a new generation of open-source…
This guide from Hugging Face systematically introduces the technical principles, categories, existing tools, and real-world challenges of AI watermarking. As…
After Google released the Gemma family of open-source models (including 2B and 7B parameter versions), Hugging Face promptly published this practical…
Google has officially released a new family of open-source large language models called "Gemma" — a series of lightweight, state-of-the-art open-source models…
Looking back on 2023, the most notable trend in the AI landscape was the explosive growth of open-source large language models (Open LLMs). In this annual…
French AI startup Mistral AI officially released its highly anticipated open-source Mixture of Experts (MoE) model — Mixtral 8x7B. The model caused a sensation…
Mixture of Experts (MoE) has become a core technology for improving the performance and efficiency of today's large language models (LLMs). Traditional "dense…
Hugging Face announced the launch of a new open-source library called "Optimum-NVIDIA," the result of a deep collaboration with NVIDIA, aimed at seamlessly…
As large language models (LLMs) such as Llama 2 become more widely adopted, achieving efficient and cost-effective inference in production environments has…
As the parameter count of large language models (LLMs) has grown dramatically, running and fine-tuning these models on consumer-grade GPUs or limited hardware…
The Technology Innovation Institute (TII) in Abu Dhabi, UAE, has officially released what is currently the largest openly accessible large language model on…
Meta has officially launched Code Llama, a family of state-of-the-art open-source code generation models fine-tuned from Llama 2. Code Llama achieves leading…
Meta's Llama 2 represents a landmark in the history of open-source large language model (LLM) development. Its performance was regarded at the time…