Latest in AI

Showing: inference, Researchers

Accelerating Large Language Model (LLM) Inference with TGI on Intel Gaudi ★ 75
Hugging Face Blog · 443 days ago · Release
Hugging Face's official blog has announced that its widely adopted open-source large model inference framework, Text Generation Inference (TGI), now officially…
Hugging Face Inference Endpoints Launches a New Analytics Dashboard, Improving Model Monitoring and the MLOps Experience
Hugging Face Blog · 450 days ago · Release
Hugging Face recently announced a major upgrade to its hosted model deployment service, "Inference Endpoints," introducing a brand-new and far more modern…
Hugging Face Adds Three New Serverless Inference Providers: Hyperbolic, Nebius AI Studio, and Novita AI ★ 75
Hugging Face Blog · 481 days ago · Release
On February 18, 2025, Hugging Face announced the addition of three new partners to its serverless inference ecosystem: Hyperbolic, Nebius AI Studio, and Novita…
Welcome Fireworks.ai to the Hugging Face Hub 🎆 ★ 75
Hugging Face Blog · 485 days ago · Release
On February 14, 2025, Hugging Face — the leading open-source AI community — officially announced the integration of high-performance AI inference platform…
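Lessons from 1 Billion Classifications: Hugging Face Shares How to Run Large-Scale Classification Quickly and at Very Low Cost with Open-Source Models ★ 80
Hugging Face Blog · 486 days ago · Tutorial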
In the current era of generative AI sweeping the globe, many developers habitually feed all tasks — including simple text classification, sentiment analysis…
How to Deploy and Fine-Tune DeepSeek Models on AWS: The Official Hugging Face Guide ★ 85
Hugging Face Blog · 500 days ago · Tutorial
As DeepSeek-R1 swept through the AI landscape on the strength of its powerful reasoning capabilities, how to safely and efficiently deploy and fine-tune these…
Hugging Face Hub Launches "Inference Providers": One-Click Switching Among Multiple Third-Party High-Performance Inference Providers ★ 85
Hugging Face Blog · 502 days ago · Release
Hugging Face has officially launched the "Inference Providers" feature on the Hugging Face Hub — a major update designed to address the pain points developers…
Hugging Face TGI Announces Multi-Backend Support: Integrating TensorRT-LLM and vLLM ★ 85
Hugging Face Blog · 514 days ago · Release
Text Generation Inference (TGI), Hugging Face's open-source LLM inference and deployment framework, has received a major architectural update, officially…
Replicate Officially Supports NVIDIA L40S GPUs: Better Performance at Lower Cost
Replicate Blog · 576 days ago · New Tool
The AI deployment platform Replicate has announced the official availability of NVIDIA L40S GPU compute on its platform. This update aims to provide developers…
Fine-Tuning LLMs to 1.58 Bits: Making Extreme Quantization Easy ★ 85
Hugging Face Blog · 634 days ago · Tutorial
The deployment of large language models (LLMs) has long faced a dual bottleneck of VRAM capacity and memory bandwidth. Microsoft previously introduced the…
An Introduction to GGML: The Key Technology for Running Large Language Models Efficiently on Consumer Hardware ★ 80
Hugging Face Blog · 670 days ago · Tutorial
GGML is a lightweight, zero-dependency C/C++ tensor library developed by Georgi Gerganov. It was originally designed to enable efficient local inference of the…
Hugging Face Partners with NVIDIA NIM to Launch Serverless Inference ★ 82
Hugging Face Blog · 685 days ago · Release
Hugging Face and NVIDIA announced a major partnership in late July 2024, officially launching a serverless inference service powered by NVIDIA NIM (NVIDIA…
TGI Multi-LoRA: Deploy Once, Serve 30 Fine-Tuned Models ★ 80
Hugging Face Blog · 696 days ago · Release
The Hugging Face official blog has introduced a major update to its open-source text generation inference engine, Text Generation Inference (TGI): the…
Google Cloud TPUs Arrive on Hugging Face, with Support for Inference Endpoints and Spaces ★ 75
Hugging Face Blog · 705 days ago · Release
Hugging Face announced a deep partnership with Google Cloud, officially integrating Google Cloud TPUs (Tensor Processing Units) into the Hugging Face platform…
NVIDIA H100 GPUs Are Coming to Replicate: Faster Model Inference and Training
Replicate Blog · 732 days ago · Release
The official blog of Replicate, the popular AI model hosting and deployment platform, has announced that NVIDIA H100 Tensor Core GPUs will soon be officially…
Benchmarking Text Generation Inference (TGI): How to Measure and Optimize LLM Inference Performance ★ 75
Hugging Face Blog · 746 days ago · Tutorial
This official Hugging Face blog post takes an in-depth look at how to benchmark Text Generation Inference (TGI), Hugging Face's open-source LLM inference and…
Easily Deploy Models to AWS Inferentia2 Chips on Hugging Face ★ 75
Hugging Face Blog · 753 days ago · Release
Hugging Face has announced official support for AWS Inferentia2 (Inf2) instances within its hosted Inference Endpoints service. This update gives developers…
Building Cost-Effective Enterprise RAG Applications with Intel Gaudi 2 and Intel Xeon ★ 70
Hugging Face Blog · 766 days ago · Tutorial
As enterprise demand for Retrieval-Augmented Generation (RAG) technology surges, how to maintain high performance while controlling hardware costs has become…
Running Privacy-Preserving Fully Homomorphic Encryption (FHE) Inference on Hugging Face Endpoints ★ 75
Hugging Face Blog · 789 days ago · Release
This article introduces how to run privacy-preserving inference based on Fully Homomorphic Encryption (FHE) on Hugging Face Endpoints. In traditional…
Optimum-NVIDIA: Unlock Blazing-Fast LLM Inference with Just One Line of Code ★ 80
Hugging Face Blog · 922 days ago · Release
Hugging Face announced the launch of a new open-source library called "Optimum-NVIDIA," the result of a deep collaboration with NVIDIA, aimed at seamlessly…
Goodbye Cold Starts: How Hugging Face Sped Up LoRA Inference by 300% ★ 85
Hugging Face Blog · 922 days ago · Release
In real-world generative AI applications, fine-tuning for specific tasks or clients is a common requirement. However, deploying a full base model for every…
Make Your Llama Generation Fly: Accelerating with AWS Inferentia2 ★ 72
Hugging Face Blog · 950 days ago · Tutorial
As large language models (LLMs) such as Llama 2 become more widely adopted, achieving efficient and cost-effective inference in production environments has…
Easily Deploy High-Performance Embedding Models with Hugging Face Inference Endpoints ★ 75
Hugging Face Blog · 964 days ago · Release
As large language models (LLMs) and Retrieval-Augmented Generation (RAG) technology become increasingly widespread, embedding models have become an…
Hugging Face Launches a Dedicated Inference Service for PRO Subscribers: Higher Rate Limits and Support for Large Open-Source Models ★ 70
Hugging Face Blog · 996 days ago · Release
The Hugging Face official blog has announced a new "Inference for PROs" upgraded service for PRO subscribers (at $9 per month). This service is designed to…
A Complete Overview of Quantization Schemes Natively Supported in Hugging Face Transformers: A Practical Guide to bitsandbytes and GPTQ ★ 75
Hugging Face Blog · 1,006 days ago · Tutorial
As the parameter count of large language models (LLMs) has grown dramatically, running and fine-tuning these models on consumer-grade GPUs or limited hardware…
Fetch Adopts Amazon SageMaker and Hugging Face, Cutting Machine Learning Processing Latency by 50%
Hugging Face Blog · 1,017 days ago · Business
This case study examines how Fetch, a leading consumer rewards platform in the United States, leveraged the collaboration between Amazon SageMaker and Hugging…
A Panoramic Guide to Hugging Face's Open-Source Text Generation and LLM Ecosystem ★ 85
Hugging Face Blog · 1,063 days ago · Release
This official Hugging Face blog post systematically maps out the complete ecosystem it has built around open-source large language models (LLMs). As…
Easily Deploy Large Language Models (LLMs) with Hugging Face Inference Endpoints ★ 75
Hugging Face Blog · 1,076 days ago · Tutorial
This official Hugging Face blog post introduces how to use their hosted service "Inference Endpoints" to deploy large language models (LLMs). With the rapid…
The Falcon Family of Open-Source Models Lands in the Hugging Face Ecosystem ★ 75
Hugging Face Blog · 1,105 days ago · Release
The Falcon series of large language models (including Falcon-40B and Falcon-7B), developed by Abu Dhabi's Technology Innovation Institute (TII), have…
Accelerating Hugging Face Transformers Model Inference with AWS Inferentia2 ★ 70
Hugging Face Blog · 1,154 days ago · Release
This article explains how to accelerate the deployment and inference of Hugging Face Transformers models using AWS Inferentia2 (Inf2 instances) — AWS's…
