Latest in AI

Do agents.md files help coding agents?
Hacker News (AI keywords) · 7 days ago · Commentary
The source only provides the title, so no conclusion or evidence can be verified. The topic appears to ask whether an agents.md file helps coding agents understand project conventions, commands, and constraints. This is relevant to developers adopting AI coding tools, but any claims about effectiveness would require the original post or supporting examples.
Community Discussion: Local Installation and Multilingual Training for Kokoro TTS
r/LocalLLaMA top day · 7 days ago · Commentary
A LocalLLaMA subreddit post discusses challenges with Kokoro TTS's multilingual performance on cloud APIs. The author is seeking community advice on how to install Kokoro locally and train/fine-tune it for Brazilian Portuguese to achieve more natural-sounding speech.
Google's Official Gemma 4 QAT Q4_0 GGUFs Have Higher Precision Than Unsloth's Q4_K_XL
r/LocalLLaMA top day · 7 days ago · Commentary
An analysis of Gemma 4 QAT GGUF files reveals that Google's official 'Q4_0' releases actually employ a mixed-precision strategy. For smaller models like E2B and E4B, Google keeps critical token embeddings in Q6_K and certain projection weights in F16. This makes Google's Q4_0 files larger and more precise than Unsloth's 'Q4_K_XL' versions, which default to standard Q4_0 for almost all tensors.
DeepSeek enters the fight for token volume, Anthropic dominates spend
Vercel Changelog · 7 days ago · Business
The title points to a split AI market: DeepSeek is competing for token volume, while Anthropic remains dominant in spend. That suggests high-volume, cost-sensitive workloads may be opening up to DeepSeek, while Claude-related usage may still anchor higher-value or higher-cost production tasks. Without the full article, exact shares, model versions, and trend data cannot be confirmed.
SDSU Wired Its Dorms with 1,300 AI Cameras Without Telling Students
Hacker News (AI keywords) · 7 days ago · Ethics
San Diego State University reportedly deployed around 1,300 AI-enabled cameras across campus, including roughly 330 tied to student dorm areas. The controversy centers on whether students were adequately informed and whether residential common areas should be treated as ordinary surveillance zones. With no full article text provided, the strongest reading is that this is an AI governance and privacy incident, not a model or product launch story.
When GPUs Turn from Cost Burden into Profit Engine, Enterprise AI Enters a New Game
INSIDE 硬塞 AI · 7 days ago · Business
INFINITIX addresses low GPU utilization with software designed for enterprise AI infrastructure. Its AI-Stack uses virtualization and scheduling to maximize GPU efficiency and reduce idle compute. The ixCSP platform helps service providers turn compute capacity into operational cloud services, reframing GPUs from a cost burden into a potential revenue-generating asset.
Gemma 4 31B FP8 Matches Claude Sonnet 4.6 Medium in Custom Benchmark ★ 75
r/LocalLLaMA top day · 7 days ago · Benchmark
A Reddit user shared benchmark results showing Google's Gemma 4 31B (FP8) performing on par with Claude Sonnet 4.6 Medium. The custom evaluation harness tested complex tasks including Neo4j Cypher queries, entity extraction, agentic tool calling, Python coding, and multi-vector retrieval synthesis. This highlights how quantized mid-sized open-source models are closing the gap with leading proprietary frontier models.
NVIDIA and LG Group Build an AI Factory for Physical AI, Mobility and AI Infrastructure ★ 74
NVIDIA Blog · 7 days ago · Hardware
NVIDIA and LG Group are collaborating on an AI factory to support LG’s AI-driven businesses across robotics, autonomous driving, data center technologies and GPU cloud services. The effort connects NVIDIA’s AI factory platform with LG’s manufacturing, mobility, robotics and infrastructure capabilities. It also covers Isaac, Cosmos, DRIVE, DSX and EXAONE-related work using Blackwell GPUs, NeMo, Nemotron datasets and TensorRT-LLM.
WWDC26 Countdown: Event Giveaway Revealed with Classic Dogcow Reference
INSIDE 硬塞 AI · 7 days ago · Commentary
INSIDE’s short post frames WWDC26 through an event-exclusive giveaway tied to Apple nostalgia. The visible text focuses on Dogcow, the classic old Mac character whose sound is “Moof,” blending moo and woof. No AI model, developer tool, or product feature is described in the provided excerpt, so this is best read as Apple culture and event-merchandise coverage.
Jensen Huang Signs Korea Deals with SK Hynix, NAVER and Doosan
INSIDE 硬塞 AI · 7 days ago · Hardware
Nvidia announced partnerships with SK Hynix, NAVER and Doosan Group to bring its technology into AI data center projects in Korea. The collaboration also covers next-generation memory development, tying Nvidia more closely to Korea’s semiconductor and digital infrastructure ecosystem. The article does not specify investment size, deployment timeline or data center scale.
Best Local TTS Solution
r/LocalLLaMA top day · 7 days ago · Commentary
An r/LocalLLaMA user says they have tested many local TTS tools, but none match ElevenLabs for expressiveness, voices, and cloning. They list moss-nano and Kokoro as the best edge-device candidates so far, with edgeTTS as a free/cloud option. The post asks for community experience connecting agents such as Hermes, openclaw, or opencode to Telegram voice notes or real-time voice conversations.
Algorithmic Monocultures in Hiring ★ 78
Hacker News (AI keywords) · 7 days ago · Paper
This study analyzes 3.4 million real applicants and 4 million applications across 156 U.S. employers. It finds position-level racial adverse impact that aggregate analysis can obscure, especially affecting Black and Asian applicants. The authors also show that reliance on a single vendor can create homogeneous outcomes and systemic rejections, calling for stronger audits, surveillance, and researcher access.
DeepSeek V4 Pro beats GPT-5.5 Pro on precision
Hacker News (AI keywords) · 7 days ago · Benchmark
RuntimeWire compared DeepSeek V4 Pro and GPT-5.5 Pro across four fresh text tasks, with DeepSeek winning 38.0 to 33.0. The article highlights DeepSeek’s stronger handling of regex edge cases, workplace-update constraints, and exact JSON schema compliance. GPT-5.5 Pro remained capable, but lost points for avoidable deviations, extra process details, and minor structural mismatches.
A Matter Wi-Fi Light Bulb in Rust on the Raspberry Pi Pico 2 W
Hacker News (AI keywords) · 7 days ago · Hardware
This GitHub repository collects Rust Embassy examples for Raspberry Pi Pico 2 and Pico 2 W. Its Matter Wi-Fi light example uses rs-matter, BLE commissioning, and Wi-Fi connectivity so the board can appear as a standard smart bulb in Home Assistant, Apple Home, or Google Home. The project is mainly relevant to embedded Rust and smart-home developers, not AI model users.
User Shares Gemma 4 QAT Experience: Improved Quality and MTP Speedups
r/LocalLLaMA top day · 7 days ago · Opinion
A Reddit user shared their experience with the Gemma 4 31B QAT (Quantization-Aware Training) model. Compared to traditional GGUF quants like Q6_K_L, the QAT version delivers noticeable quality improvements in roleplay and long-context tasks. Additionally, combining the QAT model with Multi-Token Prediction (MTP) yielded massive speedups, boosting generation speeds from ~20 t/s to up to 50 t/s.
The Open Source Community is backing OpenEnv for Agentic RL
Hugging Face Blog · 7 days ago · Commentary
The title indicates that OpenEnv is being positioned around agentic reinforcement learning. The confirmed signal is community support from the open-source ecosystem, not specific technical claims. Without the full article, details such as contributors, features, integrations, benchmarks, or adoption status should be treated as unknown.
datasette-agent-edit 0.1a0
Simon Willison's Weblog · 7 days ago · Release
Simon Willison released datasette-agent-edit 0.1a0 as a base plugin for Datasette Agent. It is intended to support future plugins that edit existing text, including collaborative Markdown, large SQL queries, and SVG files. The design follows Claude’s text editor tool pattern, exposing view, str_replace, and insert primitives so other plugins can reuse a stricter editing workflow.
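As a rough illustration of what the three primitives named above could look like, here is a minimal Python sketch. The `TextBuffer` class and its method signatures are hypothetical and are not taken from datasette-agent-edit; the rule that `str_replace` must match exactly once is an assumption about what a "stricter" editing workflow means.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TextBuffer:
    """Hypothetical in-memory document; not the plugin's actual class."""

    text: str

    def view(self, start: int = 1, end: Optional[int] = None) -> str:
        """Return lines start..end (1-indexed, inclusive) with line-number prefixes."""
        lines = self.text.splitlines()
        end = len(lines) if end is None else end
        selected = lines[start - 1:end]
        return "\n".join(f"{n}: {line}" for n, line in enumerate(selected, start))

    def str_replace(self, old: str, new: str) -> None:
        """Replace old with new, refusing ambiguous or missing matches (assumed rule)."""
        count = self.text.count(old)
        if count != 1:
            raise ValueError(f"expected exactly one match, found {count}")
        self.text = self.text.replace(old, new)

    def insert(self, after_line: int, new_text: str) -> None:
        """Insert new_text after the given line number; 0 inserts at the top."""
        lines = self.text.splitlines()
        if not 0 <= after_line <= len(lines):
            raise ValueError("line number out of range")
        lines[after_line:after_line] = new_text.splitlines()
        self.text = "\n".join(lines)
```

Forcing an exact single match means an agent has to quote enough surrounding context to identify the edit site, which is one way to keep edits to large SQL or SVG files from landing in the wrong place.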
NVIDIA and Doosan Group Collaborate on Physical AI and AI Factory Infrastructure
NVIDIA Blog · 7 days ago · Business
NVIDIA and Doosan Group are expanding their partnership across physical AI, robotics and AI factory infrastructure. The collaboration connects NVIDIA’s accelerated computing stack, DSX, MGX and physical AI tools with Doosan’s industrial automation, power generation and electronics materials capabilities. Key areas include smarter industrial robots, autonomous equipment, AI data center power systems and advanced PCB materials for high-performance servers and networking.
"Fully Hallucinated Operating System" Simulates an Entire OS via LLM Prompts
r/LocalLLaMA top day · 7 days ago · Commentary
A popular Reddit post highlights a video demonstrating a "Fully Hallucinated Operating System" run entirely inside an LLM. By prompting the model to act as a terminal, it simulates file systems, network requests, and command execution purely through text generation. While impractical for production, this experiment showcases the impressive state-tracking and "world model" capabilities of modern LLMs.
club-3090 Adds Experimental FP8 Support for Qwen3.6-27B
r/LocalLLaMA top day · 7 days ago · New Tool
The open-source project club-3090 has rolled out experimental FP8 quantization support for Qwen3.6-27B. This update is highly anticipated by dual RTX 3090 users, allowing them to run the model with significantly reduced VRAM requirements. According to reports, the official Qwen3.6-27B-FP8 model performs virtually identically to the original unquantized BF16 version.
How much do amd64 microarchitecture levels help in Go?
Hacker News (AI keywords) · 7 days ago · Benchmark
Daniel Lemire tests Go’s GOAMD64 levels using Roaring Bitmaps on a modern Intel Xeon. v2 brings strong gains where popcnt matters, while v3 adds further speedups in dense bitmap and set-operation workloads through AVX2. v4, despite implying AVX-512 support, shows no meaningful improvement in these benchmarks, likely due to current Go compiler limitations.
llama-server Router Mode: Pinned Model Grabs CUDA Context on All GPUs, Causing OOM
r/LocalLLaMA top day · 7 days ago · Commentary
A Reddit user highlighted a limitation in llama-server's router mode (`--models-preset`): child processes spawn and initialize CUDA contexts on all available GPUs, even when pinned to a single card. When other GPUs are fully utilized by a large model, launching a smaller model fails with a CUDA OOM error because it cannot allocate the context stub on the maxed-out cards. Currently, child processes inherit the base environment, preventing per-model `CUDA_VISIBLE_DEVICES` configuration.
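The missing piece described above is a per-child environment. The sketch below shows that general mechanism in Python and is not llama-server's router code: `child_env`, `run_pinned`, and the `PROBE` command are hypothetical names, and the probe is a stand-in process that only echoes the variable it received. `CUDA_VISIBLE_DEVICES` itself is the standard CUDA environment variable named in the post.

```python
import os
import subprocess
import sys


def child_env(gpu_ids):
    """Copy the parent environment and restrict which GPUs the child can see."""
    env = os.environ.copy()
    env["CUDA_VISIBLE_DEVICES"] = ",".join(str(i) for i in gpu_ids)
    return env


def run_pinned(command, gpu_ids):
    """Run command with its own CUDA_VISIBLE_DEVICES and return its stdout."""
    result = subprocess.run(
        command,
        env=child_env(gpu_ids),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


# Stand-in child that only reports what it was given. A real launcher would
# start one inference server process per model here instead (assumption).
PROBE = [sys.executable, "-c", "import os; print(os.environ['CUDA_VISIBLE_DEVICES'])"]

if __name__ == "__main__":
    print(run_pinned(PROBE, [1]))
```

Because the variable is set before the child starts, a CUDA program launched this way should only enumerate the listed devices, so it would have no reason to create a context on a card that is already full.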
Is this the dawn of the Tokenpocalypse?
TechCrunch AI · 7 days ago · Business
TechCrunch discusses Microsoft’s GitHub Copilot pricing changes as a sign that subsidized AI usage may be ending. As Anthropic and other major AI companies prepare for public-market scrutiny, profitability and usage-cost risks will become harder to ignore. The piece argues that higher prices, usage caps, and broader business-model changes may be necessary if AI labs want to survive beyond investor-subsidized growth.
Qwen 3.6 27B DeepSWE Benchmark Results Highlight Gap Between Local and Closed-Source Models
r/LocalLLaMA top day · 7 days ago · Benchmark
A community benchmark of Qwen 3.6 27B on DeepSWE yielded a score of 1.79% (18th of 20), slightly outperforming Haiku 4.5. Run on a single RTX 6000 Blackwell GPU via vLLM with reasoning enabled, the test averaged 32 minutes and 44k output tokens per task. The author notes that while Qwen 3.6 27B represents a 'poor man's local SOTA,' the massive gap compared to frontier closed models suggests local LLMs are struggling to keep pace in complex coding.
Amazing Digital Dentures (a failed project)
Hugging Face Blog · 7 days ago · Commentary
The post appears to discuss a project called “Amazing Digital Dentures,” explicitly framed as a failed project. Because the article body was not provided, the specific technical stack, models, tools, datasets, and reasons for failure cannot be verified. Based on the title and URL path, it may be a hackathon-style project retrospective focused on prototyping challenges and lessons learned.
Exploring 2-bit QAT: Can Ultra-Compressed Large Models Outperform 4-bit Models Half Their Size?
r/LocalLLaMA top day · 7 days ago · Commentary
A popular Reddit thread on r/LocalLLaMA discusses the potential of 2-bit Quantization Aware Training (QAT) for large MoE models (120B to 400B). While current QAT efforts focus on 4-bit, users speculate whether a 2-bit QAT model could fit into consumer hardware (64GB/128GB RAM) and outperform a 4-bit model of half its size. This approach is proposed as a practical alternative to training ternary (1.58-bit) LLMs from scratch.
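The memory comparison behind that question is simple arithmetic, sketched below in Python. This is a weights-only estimate under stated assumptions: it ignores quantization scales, KV cache, and activations, uses decimal gigabytes, and the 60B comparison model is an illustrative size rather than one named in the thread. The helper names `weight_gb` and `fits` are hypothetical.

```python
def weight_gb(params_billion: float, bits_per_weight: float) -> float:
    """Weights-only size in decimal gigabytes (1 GB = 1e9 bytes)."""
    return params_billion * bits_per_weight / 8


def fits(params_billion: float, bits_per_weight: float, ram_gb: float) -> bool:
    """True if the weights alone fit in the given RAM budget."""
    return weight_gb(params_billion, bits_per_weight) <= ram_gb


if __name__ == "__main__":
    for params, bits in [(120, 2), (60, 4), (400, 2)]:
        size = weight_gb(params, bits)
        print(f"{params}B at {bits}-bit: {size:.0f} GB of weights, "
              f"fits 64 GB: {fits(params, bits, 64)}, fits 128 GB: {fits(params, bits, 128)}")
```

Under these assumptions a 120B model at 2 bits and a 60B model at 4 bits both need about 30 GB for weights, so the comparison is between two models with the same memory footprint, while a 400B model at 2 bits needs about 100 GB and would only fit the 128 GB budget.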
Mythograph Atelier #1 - Abstract Art That Means Something to You
Hugging Face Blog · 7 days ago · Commentary
Only the title is available, so this summary is necessarily inferential. The post appears to be the first entry in a Mythograph Atelier series about abstract art that carries personal meaning. It may interest designers, creators, and AI art users exploring ways to turn memory, emotion, or symbolism into generative visual work.
MTP and QAT: What is the Relation? Running Gemma 4 31B in llama.cpp
r/LocalLLaMA top day · 7 days ago · Commentary
A popular Reddit thread addresses user confusion over running Gemma 4 31B locally. It distinguishes between MTP (Multi-Token Prediction for inference speedup) and QAT (Quantization-Aware Training for preserving 4-bit quality). It also confirms that llama.cpp's new MTP support requires updated GGUF files and a secondary draft model file for acceleration.
If LLMs Have Human-Like Attributes, Then So Does Age of Empires II
Hacker News (AI keywords) · 7 days ago · Paper
The paper argues that claims about LLMs having human-like attributes, such as morality or language understanding, can be methodologically fragile. By building and training a simple neural network on Age of Empires II, the author suggests such attributes may not be empirically unique to LLMs. The key recommendation is to define explicit measurement criteria and use a null assumption of LLM non-uniqueness before drawing anthropomorphic conclusions.
Building from Zero After Addiction, Prison, and a Felony
Hacker News (AI keywords) · 8 days ago · Commentary
Gavin Ray recounts entering juvenile prison at 14, becoming a felon at 19, and losing stability to addiction. The essay follows his path back through software work, open source, Hasura, and people willing to judge him by future contribution rather than only past record. AI is not the focus; Claude Code is only mentioned as the tool used to generate the OpenGraph SVG image.
