Latest in AI

Showing: vllm, Developers

Releasing Cohere North Mini Code
r/LocalLLaMA top day · 5 days ago · Release
Cohere’s Jay Alammar announced the official release of North Mini Code after early community feedback from r/LocalLLaMA. Weights are available on Hugging Face, including an fp8 version, and the model can be tried for free through OpenCode. For vLLM deployment, Cohere recommends using vLLM main for now and installing cohere_melody for accurate response parsing, while noting community requests for quantization and llama.cpp support.
vLLM's Evolution from V0 to V1: Practicing "Correctness Before Corrections" in Reinforcement Learning (RL) ★ 75
Hugging Face Blog · 39 days ago · Opinion
This blog post published by the ServiceNow AI team delves into the major transition of the open-source large language model inference engine vLLM from V0 to…
No GPU Left Idle: Unlocking High-Efficiency Reinforcement Learning Training with Co-located vLLM in TRL ★ 85
Hugging Face Blog · 376 days ago · Release
In the reinforcement learning from human feedback (RLHF) training process for large language models — whether PPO or the recently popular GRPO — there are…
Efficient Request Queueing: A Key Strategy for Optimizing LLM Inference Performance ★ 75
Hugging Face Blog · 438 days ago · Tutorial
The Unique Challenges and Memory Bottlenecks of LLM Inference: Traditional web services primarily handle concurrent requests through multi-threading or…
Hugging Face TGI Announces Multi-Backend Engine Support: Integrating TensorRT-LLM and vLLM ★ 85
Hugging Face Blog · 514 days ago · Release
Text Generation Inference (TGI), Hugging Face's open-source LLM inference and deployment framework, has received a major architectural update, officially…
Outlines-core 0.1.0 Officially Released: A High-Performance Structured Generation Library for Rust and Python ★ 75
Hugging Face Blog · 600 days ago · Release
In LLM application development, ensuring that a model outputs content that 100% conforms to a specific format — such as a JSON Schema, a regular expression, or…
Optimizing Your Large Language Model (LLM) in Production: A Hugging Face Practical Guide ★ 85
Hugging Face Blog · 1,003 days ago · Tutorial
This technical guide from Hugging Face systematically introduces the core strategies for deploying and optimizing large language models (LLMs) in production…