Latest in AI

Showing: edge-ai, Researchers

Bonsai LM 1-bit and 1.58-bit Benchmarks on Jetson Orin Nano Super
r/LocalLLaMA top day · 4 days ago · Benchmark
A LocalLLaMA post benchmarks five Bonsai LM models, from 1.7B to about 8B parameters, on a $250 Jetson Orin Nano Super 8GB using llama.cpp with CUDA. The tests compare the 7W, 15W, 25W, and MAXN power modes on latency, throughput, energy per token, and thermals. The main takeaway is that 25W is usually the best balance of efficiency and performance for models up to 4B, while Bonsai-8B may favor 15W for lower power draw.
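For readers unfamiliar with the energy-per-token metric, a minimal sketch of how such a figure is typically derived: average power in watts divided by generation throughput in tokens per second gives joules per token. The function names and the sample readings below are hypothetical placeholders for illustration, not measurements or code from the post.

```python
def energy_per_token(avg_power_w: float, tokens_per_s: float) -> float:
    """Joules per generated token: watts are joules/second, so W / (tok/s) = J/tok."""
    if avg_power_w < 0 or tokens_per_s <= 0:
        raise ValueError("power must be >= 0 and throughput must be > 0")
    return avg_power_w / tokens_per_s


def most_efficient_mode(measurements: dict[str, tuple[float, float]]) -> str:
    """Return the power-mode name with the lowest energy per token."""
    return min(measurements, key=lambda mode: energy_per_token(*measurements[mode]))


# Placeholder readings as (average watts, tokens/s); NOT figures from the post.
example = {
    "7W": (7.0, 4.0),
    "15W": (14.0, 10.0),
    "25W": (20.0, 16.0),
}

if __name__ == "__main__":
    for mode, (watts, tps) in example.items():
        print(f"{mode}: {energy_per_token(watts, tps):.2f} J/token")
    print("lowest energy per token:", most_efficient_mode(example))
```

A higher power mode can still come out ahead on this metric when throughput rises faster than power draw, which is the kind of trade-off the post's 25W versus 15W comparison is about.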
Jetson Orin NX Build for Hermes Agent + Benchmarking
r/LocalLLaMA top day · 5 days ago · Hardware
The post describes turning an unused Jetson Orin NX into a compact local LLM server for Hermes Agent testing. The goals were low noise, over 10 tok/s generation, 300 tok/s prompt processing, at least 65K context, and a custom case. After testing Gemma 4, Qwen 3.6, and many quant variants, the author reports Gemma 4 26B A4B UD Q2_K_XL reaching 66K context and 10.21 tok/s near 60K context.
A 4B Edge-Deployable Cognitive Model Built in China
量子位 QbitAI · 5 days ago · Release
QbitAI’s headline says a domestic Chinese team has built a 4B-parameter “cognitive model” suitable for edge deployment. The framing links it to a model direction previously associated with Andrej Karpathy. Since the article body was not provided, details such as the model name, architecture, benchmark results, hardware requirements, open-source status, and licensing remain unverified.
llama.cpp PR adds MTP support for Gemma-4 E2B and E4B assistants
r/LocalLLaMA top day · 5 days ago · Release
The Reddit post links to ggml-org/llama.cpp Pull Request #24282, which adds MTP support for Gemma-4 E2B and E4B assistants. The submitter frames it as useful for tiny Gemma models on phones, low-end machines, Raspberry Pi, or similarly constrained devices. The post does not include benchmarks, merge status, or setup instructions, so it should be treated as a development signal rather than a finished release.
Introducing Mistral 3 ★ 84
Mistral AI News · 6 days ago · Release
Mistral AI introduced Mistral 3, a new open model family under Apache 2.0. It includes Mistral Large 3, a 675B-parameter sparse MoE with 41B active parameters, plus Ministral 3 models at 3B, 8B, and 14B. The release targets frontier open-weight use, multimodal and multilingual workflows, enterprise customization, and efficient local or edge deployments.
Introducing Mistral 3 ★ 78
Mistral AI News · 6 days ago · Release
Mistral AI introduced Mistral 3, a new open model family including Mistral Large 3 and Ministral 3 models at 3B, 8B, and 14B sizes. Large 3 is a 675B-parameter sparse MoE model with 41B active parameters, while Ministral 3 targets local and edge use cases. The models are released under Apache 2.0 and are available through Mistral AI Studio, Hugging Face, Amazon Bedrock, and other platforms.
Best Local TTS Solution
r/LocalLLaMA top day · 6 days ago · Commentary
An r/LocalLLaMA user says they have tested many local TTS tools, but none match ElevenLabs for expressiveness, voices, and cloning. They list moss-nano and Kokoro as the best edge-device candidates so far, with edgeTTS as a free, cloud-based option. The post asks for community experience connecting agents such as Hermes, openclaw, or opencode to Telegram voice notes or real-time voice conversations.
Clustering 3x Jetson Nano Orin Supers for Distributed AI
r/LocalLLaMA top day · 7 days ago · Tutorial
A developer has shared a practical guide to clustering three NVIDIA Jetson Orin Nano Super boards, making use of their Ampere CUDA cores and unified memory. The project is part of 'smolcluster,' an initiative to make distributed AI training and inference accessible on everyday hardware such as Macs, Raspberry Pis, and Jetsons. The series aims to explore whether heterogeneous clusters, which mix different hardware architectures, can run local LLMs effectively.
Gemma 4 QAT models: Optimizing model compression for mobile and laptop efficiency ★ 72
Hacker News (AI keywords) · 9 days ago · Release
Google released new Gemma 4 checkpoints optimized with Quantization-Aware Training to preserve quality after compression. The release includes Q4_0 checkpoints and a mobile-focused quantization format that can reduce Gemma 4 E2B memory use to about 1GB, or below 1GB for a text-only configuration. The models are available through Hugging Face and supported across llama.cpp, Ollama, LM Studio, LiteRT-LM, Transformers.js, SGLang, vLLM, MLX, and Unsloth.
NXP Computex 2026 Keynote: Neural Axis for Physical AI Hardware
INSIDE 硬塞 AI · 11 days ago · Hardware
At Computex 2026, NXP focused on Physical AI and introduced its Neural Axis architecture for edge devices. The architecture emphasizes low latency, high security, and hardware-based trust for real-time responses. The article frames this as important for robotics, autonomous vehicles, and other physical-world AI deployments where safe operation is essential.
NVIDIA Space Computing Gets First Hardware Case as Aitech Integrates IGX Thor
INSIDE 硬塞 AI · 17 days ago · Hardware
Aitech announced it will integrate NVIDIA IGX Thor into its space supercomputer for low Earth orbit missions. The goal is to provide onboard AI edge computing and enable real-time inference directly in orbit. By processing more data in space, the system aims to reduce dependence on ground communications and extend AI compute beyond Earth-based infrastructure.
Ukrainian Drone Founder Yaroslav Azhnyuk on the Autonomous Drone Tech Stack and Drone Economics: The West Is Asleep
Latent Space · 27 days ago · Commentary
In this episode of the Latent Space podcast, the hosts and guest host Noah Smith (author of the well-known economics and technology blog Noahpinion)…
Google Releases Gemma 4: Frontier Multimodal Open Models Designed for On-Device Use ★ 85
Hugging Face Blog · 73 days ago · Release
Google and Hugging Face have jointly announced a new generation of open-weight models — "Gemma 4." This model represents a major breakthrough in on-device AI…
IBM Launches Granite 4.0 3B Vision: A Lightweight Multimodal AI Model Designed for Enterprise Documents ★ 75
Hugging Face Blog · 75 days ago · Release
IBM has officially launched its new lightweight multimodal model on Hugging Face — the Granite 4.0 3B Vision. With 3 billion (3B) parameters, this model is…
Import AI 448: AI R&D Trends, ByteDance's CUDA-Writing Agent, Satellite Edge AI, and the Future of AI Warfare ★ 75
Import AI (Jack Clark) · 97 days ago · Commentary
This issue of Import AI 448, written by Jack Clark, takes a deep dive into the latest developments in AI R&D, automated hardware optimization, and the…
Hugging Face Teams Up with NXP: Bringing Robotics AI to Embedded Platforms (Dataset Recording, VLA Fine-Tuning, and On-Device Optimization) ★ 75
Hugging Face Blog · 101 days ago · Tutorial
Hugging Face has entered into a deep collaboration with semiconductor giant NXP (NXP Semiconductors), aimed at solving the challenge of deploying advanced…
GGML and llama.cpp Officially Join Hugging Face to Safeguard the Long-Term Development of Local AI ★ 95
Hugging Face Blog · 114 days ago · Business
A historic milestone has arrived in the open-source AI world: GGML and llama.cpp — the open-source projects founded by Georgi Gerganov that laid the…
Open Evaluation Standards: Benchmarking NVIDIA Nemotron 3 Nano with NeMo Evaluator ★ 70
Hugging Face Blog · 179 days ago · Tutorial
As large language models (LLMs) develop in two divergent directions — with extremely large cloud-based models at one end and lightweight "Nano"-scale models…
Examining the Shift in the Global Compute Landscape: Hugging Face Analyzes the Future of AI Infrastructure ★ 75
Hugging Face Blog · 228 days ago · Opinion
Against the backdrop of explosive global growth in artificial intelligence, compute has become the core resource that determines technological competitiveness…
How to Build Medical Robots with the NVIDIA Isaac Healthcare Platform: A Complete Guide from Simulation to Deployment ★ 70
Hugging Face Blog · 228 days ago · Tutorial
As healthcare demands increase and medical staffing shortages worsen, the development of medical robots — such as robots for ward supply delivery, assisted…
Granite 4.0 Nano: Exploring the Limits of On-Device AI, Just How Small Can a Model Get? ★ 75
Hugging Face Blog · 229 days ago · Release
This article, jointly published by IBM and Hugging Face, delves into the technical details and application scenarios of the brand-new ultra-lightweight model…
Google DeepMind Launches Gemma 3 270M: An Ultra-Lightweight Model Designed for Hyper-Efficient AI ★ 72
Google DeepMind Blog · 234 days ago · Release
Google DeepMind has officially announced the addition of a highly distinctive and specialized new member to its open-source model family — Gemma 3 270M. This…
Run a Vision Language Model (VLM) on Intel CPUs in Just Three Simple Steps ★ 70
Hugging Face Blog · 242 days ago · Tutorial
Visual Language Models (VLMs) combine computer vision with natural language processing, enabling complex tasks such as image captioning and visual question…
Arm to Exhibit at PyTorch Conference, Showcasing AI Inference on the Arm Architecture and KleidiAI Optimizations
Hugging Face Blog · 247 days ago · Business
Arm has officially announced on the Hugging Face blog that it will actively participate in the upcoming PyTorch Conference. As the Arm architecture gains…
Accelerating a Qwen3-8B Agent on Intel Core Ultra with a Depth-Pruned Draft Model ★ 75
Hugging Face Blog · 258 days ago · Tutorial
As AI Agent applications become increasingly widespread, running large language models (LLMs) efficiently on personal computers (such as AI PCs powered by…
Introducing the Palmyra-mini Series: Powerful, Lightweight New Models with Reasoning Capabilities! ★ 72
Hugging Face Blog · 275 days ago · Release
Writer, a leading provider of enterprise AI solutions, has officially announced the launch of its new "Palmyra-mini" model series on the Hugging Face platform…
Mass Intelligence: From GPT-5 to Small Edge Models, Powerful AI Is Going Mainstream ★ 85
One Useful Thing (Mollick) · 289 days ago · Opinion
In this article exploring "Mass Intelligence," University of Pennsylvania Wharton School professor Ethan Mollick reveals an imminent future: high-level…
Arm and ExecuTorch 0.7 Join Forces: Bringing Generative AI to the Mass Market ★ 80
Hugging Face Blog · 305 days ago · Release
As generative AI advances rapidly, deploying massive models to resource-constrained edge devices — such as smartphones, smart hardware, and AI PCs — has become…
Arm and Hugging Face Jointly Launch "Neural Super Sampling": Accelerating AI Image Super Sampling on Mobile and Edge Devices ★ 75
Hugging Face Blog · 306 days ago · Release
Arm and Hugging Face have announced a collaboration to launch "Neural Super Sampling (NSS)" technology and related models, officially bringing AI-driven image…
NVIDIA Llama Nemotron Nano VLM Arrives on Hugging Face Hub ★ 75
Hugging Face Blog · 351 days ago · Release
NVIDIA has partnered with Hugging Face to officially bring its latest lightweight vision-language model (VLM), the NVIDIA Llama Nemotron Nano VLM, to the…
