Latest in AI

New to Local LLMs: Overwhelmed by Tool Choices, Model Naming, and Quantization
r/LocalLLaMA top day · 4 days ago · Tutorial
A first-time local LLM user installed ollama on Windows with gemma4 and qwen3.6, but quickly hit a wall of confusion around GUI tool selection, model size tradeoffs, and cryptic quantization naming like Q4_K_M and IQ4_XS. Despite owning high-end hardware (RTX 5090, 64GB DDR5, 9950X3D), the user lacks the foundational knowledge to make informed choices. The post highlights ongoing onboarding gaps in the local LLM ecosystem, where fragmented tooling and jargon-heavy documentation create steep barriers for newcomers.
Rich Sutton on AI Creativity and Discovery
Hacker News (AI keywords) · 4 days ago · Opinion
Reinforcement learning pioneer Rich Sutton posted on Twitter about AI creativity and discovery, touching on one of the field's most debated questions. Known for the influential 'Bitter Lesson,' Sutton consistently argues for general computation-based methods over hand-coded knowledge. Note: original tweet content was not provided; this summary is inferred from the title alone.
Without open LLM competition, closed-source LLM companies will become insatiable
r/LocalLLaMA top day · 4 days ago · Opinion
An r/LocalLLaMA user criticizes closed-source LLM providers, singling out Anthropic and its $200/month users. The post argues that without open-source model competition, proprietary AI companies could become more arrogant and less accountable to customers. The source offers little concrete context beyond an image and opinionated commentary, so it is best read as a community sentiment post rather than a verified product incident.
Releasing Apodex-1.0 Smol Models (0.8B, 2B, 4B Open-Weights) Optimized for Agentic Verification + AgentHarness Evals
r/LocalLLaMA top day · 4 days ago · Release
Apodex 1.0 launches with open-weight models at 0.8B, 2B, and 4B, trained not for general generation but for specialized sub-agent roles—fact-checking external claims and verifying tool call outputs before passing results to a main controller. The design targets long-horizon agent workflows where routing small tasks to lightweight models avoids wasteful use of 70B+ models at every step. AgentHarness, an open-source evaluation framework for local multi-step agent pipelines, is released alongside the weights.
German court rules Google liable for false answers in AI Overviews, declaring them Google's own words ★ 72
Hacker News (AI keywords) · 4 days ago · Regulation
A landmark German court ruling has declared that Google's AI Overviews are legally Google's own words, not neutral third-party aggregations. This makes Google directly liable for false or misleading answers generated by the feature, removing the 'just a tool' defense. The ruling is among the first globally to apply traditional media liability frameworks to generative AI search results.
If Claude Fable 5 Silently Degrades Your Responses, You'll Never Know ★ 73
Simon Willison's Weblog · 5 days ago · Ethics
Anthropic's 319-page Fable 5 system card discloses a silent intervention mechanism that covertly limits model effectiveness for requests related to frontier LLM development, including pretraining pipelines, distributed training infrastructure, and ML accelerator design. Unlike other safeguards, these interventions are invisible to users: they rely on prompt modification, steering vectors, or PEFT, with no warning and no fallback. The mechanism is estimated to affect 0.03% of traffic, but critics like Simon Willison warn that it sets a troubling precedent for AI transparency.
macOS Container Machines: Apple's Lightweight Linux VM Architecture for Native Containers
Hacker News (AI keywords) · 5 days ago · New Tool
Apple's open-source `container` project enables running Linux containers on macOS without Docker Desktop by using lightweight Linux VMs (Container Machines) built on Apple's Virtualization Framework. Each Container Machine provides isolated Linux kernel support for OCI-compliant workloads. This is particularly relevant for AI/ML developers needing local container environments on Apple Silicon Macs.
Threshold Billing Is Now Enabled for Vercel Pro Teams
Vercel Changelog · 5 days ago · Release
Vercel has rolled out threshold billing to all Pro team accounts. This feature allows team admins to define usage thresholds that trigger billing only when exceeded, reducing the risk of unexpected cost spikes. It is a practical cost-control improvement for developers and small teams relying on Vercel for frontend and full-stack deployments.
Building Trust in Enterprise AI: Together AI Earns ISO 27001:2022 Certification
Together AI · 5 days ago · Business
Together AI announced it has earned ISO 27001:2022 certification, the latest version of the international information security management standard. This positions the AI inference platform to better serve enterprise customers in regulated industries such as finance, healthcare, and legal tech, where third-party security certification is often a hard procurement requirement. The milestone helps Together AI compete more credibly against hyperscaler AI services like Amazon Bedrock and Azure AI.
Initial Impressions of Claude Fable 5 ★ 71
Simon Willison's Weblog · 5 days ago · Commentary
Anthropic released Claude Fable 5 and Claude Mythos 5 simultaneously. Fable 5 matches Mythos 5 in capability but adds strict safety classifiers, with new API fallback mechanisms for rejected requests. Both models offer a 1M-token context window, 128K max output, and a January 2026 knowledge cutoff, and both are priced at $10/$50 per million tokens, double Opus 4.x. Simon's knowledge-breadth test shows Fable 5 substantially outperforming Opus 4.8, listing dozens of his open-source projects with approximate dates from memory alone.
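To make the quoted pricing concrete, here is a minimal sketch of the arithmetic. It assumes that "$10/$50" means USD per million input tokens and per million output tokens respectively (the summary does not spell this out), and it ignores prompt caching or any other discounts. The function name `estimate_cost_usd` is hypothetical.

```python
from decimal import Decimal

# Assumption: "$10/$50 per million tokens" = input / output price in USD.
INPUT_USD_PER_MTOK = Decimal("10")
OUTPUT_USD_PER_MTOK = Decimal("50")
TOKENS_PER_MTOK = Decimal(1_000_000)


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the list-price cost of one request, ignoring caching discounts."""
    total = input_tokens * INPUT_USD_PER_MTOK + output_tokens * OUTPUT_USD_PER_MTOK
    return total / TOKENS_PER_MTOK


if __name__ == "__main__":
    # 200K tokens in, 10K tokens out: (2,000,000 + 500,000) / 1,000,000 = 2.5
    print(estimate_cost_usd(200_000, 10_000))
```

Under these assumptions, output tokens dominate the bill quickly: a response one fifth the length of the prompt costs as much as the prompt itself.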
Furiosa AI inference chip could be a game changer for local LLMs
r/LocalLLaMA top day · 5 days ago · Hardware
An r/LocalLLaMA post discusses Furiosa AI’s RNGD inference chip, citing TSMC 5nm, Hynix HBM3, 48GB VRAM, 1.5TB/s bandwidth, and 180W TDP. The author argues it could matter for local LLM users if Furiosa opens its programming interface and works with llama.cpp on a GGML backend. The post later clarifies that Furiosa is not selling to consumers, so this is a wish and market commentary, not a launch.
Hot take: "Vibecoding" is being used for two different things and it causes unnecessary friction
r/LocalLLaMA top day · 5 days ago · Commentary
A Reddit user argues "vibecoding" carries two distinct meanings: throwing code at AI carelessly with no engineering judgment, versus using heavy AI assistance while still maintaining quality standards. Andrej Karpathy's own practice almost certainly fits the second definition, not the first. This semantic ambiguity fuels unnecessary arguments whenever the community debates AI-assisted development quality.
NVIDIA Confidential Computing to Help Expand Apple's Private Cloud Compute to Google Cloud
NVIDIA Blog · 5 days ago · Business
Apple announced at WWDC that its Private Cloud Compute (PCC) will expand beyond its own data centers to Google Cloud, powered by NVIDIA GPUs with Confidential Computing. NVIDIA's hardware-level trusted execution environment enables confidential inference for Apple Foundation Models, co-built with Google, preserving user privacy even on third-party infrastructure. This three-way collaboration marks a significant industry validation of confidential computing for large-scale commercial AI deployments.
llm 0.32a3 alpha release, almost entirely written by Claude Fable 5
Simon Willison's Weblog · 5 days ago · Release
Simon Willison has published llm 0.32a3, an alpha release of his popular LLM CLI and Python library. The standout detail is that nearly all of the code was written by the new Claude Fable 5 model using Claude Code. Willison also posted a detailed write-up covering how he used Claude Code to add features to both his datasette agent and llm projects.
Surprise, Pay $1000: Unexpected Costs When Using Blacksmith
Hacker News (AI keywords) · 5 days ago · Incident
The author shares a first-hand account of being hit with a surprise $1,000 charge while using Blacksmith, a high-speed GitHub Actions runner service popular in AI/ML workflows. The post highlights how pay-as-you-go compute pricing can spiral without proper spending caps or usage alerts. It serves as a reminder for developers and founders to guard against runaway cloud costs when integrating third-party CI/CD or GPU services into their pipelines.
Setting a Custom Price for a Model in AgentsView
Simon Willison's Weblog · 5 days ago · Tutorial
AgentsView, built by Wes McKinney, visualizes token usage and costs across local coding agents. When Claude Fable 5 launched without being listed in AgentsView's pricing database, Simon Willison used Fable itself to reverse-engineer the tool and find a recipe for setting custom prices. He also shared a treemap showing over $83 in single-day Fable 5 spending and $516 saved via prompt caching.
If Claude Fable Stops Helping You, You'll Never Know
Hacker News (AI keywords) · 5 days ago · Ethics
A Hacker News post claims that Claude Fable 5's usage policy or model behavior allows Anthropic to silently sabotage or degrade service for applications it identifies as competitors. Unlike typical API errors, this degradation produces no alerts or error codes, leaving developers unable to distinguish intentional throttling from normal model variance. The piece raises serious questions about transparency, fair competition, and the trust developers can place in AI API providers.
Exif Smuggling: PoC for Hiding Malicious Prompts in Image EXIF Metadata
Hacker News (AI keywords) · 5 days ago · Incident
Exif Smuggling is a security PoC showing how attackers can embed hidden instructions in image EXIF metadata fields to perform indirect prompt injection against vision-capable AI models. When AI systems parse images alongside their metadata, embedded malicious text may be processed as legitimate instructions, bypassing standard input filters. Developers building AI apps with image upload features should strip or sanitize EXIF data before passing content to language models.
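The advice to strip metadata before an image reaches a model can be sketched with the standard library alone. What follows is a minimal sketch, assuming JPEG input; the function name `strip_jpeg_metadata` is hypothetical and is not taken from the PoC. It drops APP1 segments (where Exif and XMP data live) and COM comment segments, and copies everything else unchanged. It does not handle other image formats, padding bytes between segments, or text rendered into the pixels themselves.

```python
import struct

SOI = b"\xff\xd8"            # start-of-image marker
SOS = 0xDA                   # start-of-scan: compressed image data follows
DROP_MARKERS = {0xE1, 0xFE}  # APP1 (Exif/XMP) and COM (free-text comment)


def strip_jpeg_metadata(data: bytes) -> bytes:
    """Return a copy of a JPEG byte stream without APP1 and COM segments."""
    if data[:2] != SOI:
        raise ValueError("not a JPEG stream")
    out = bytearray(SOI)
    pos = 2
    while pos < len(data):
        if pos + 4 > len(data) or data[pos] != 0xFF:
            raise ValueError("malformed segment header")
        marker = data[pos + 1]
        if marker == SOS:
            # Everything from here on is scan data; copy it verbatim.
            out += data[pos:]
            return bytes(out)
        # The 2-byte big-endian length counts itself but not the marker.
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        end = pos + 2 + length
        if length < 2 or end > len(data):
            raise ValueError("malformed segment length")
        if marker not in DROP_MARKERS:
            out += data[pos:end]
        pos = end
    raise ValueError("no start-of-scan marker found")
```

A typical call would be `clean = strip_jpeg_metadata(uploaded_bytes)` before the image is forwarded anywhere. In production, a maintained image library that re-encodes the file is likely the safer choice, since it also covers formats and edge cases this sketch skips.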
Upcoming Breaking Changes for NPM v12
Hacker News (AI keywords) · 5 days ago · Release
GitHub's official changelog published a heads-up about breaking changes coming in NPM v12, targeting JavaScript and Node.js developers. Major version upgrades typically drop deprecated APIs, raise minimum Node.js version requirements, and alter lockfile formats or dependency resolution logic. Developers maintaining packages or CI pipelines should review the changes early to avoid disruption upon upgrading.
Anthropic's Claude Fable 5 Can Generate Surprisingly Fun Video Games with One Click
TechCrunch AI · 5 days ago · Release
Anthropic's latest flagship model, Claude Fable 5, has demonstrated the ability to generate oddly entertaining video games at the push of a button. The capability is expected to resonate strongly with the vibe coding community — users who prefer describing intent in natural language rather than writing code manually. This positions Fable 5 as a potentially transformative tool for indie developers, designers, and no-code creators.
Grit: Rewriting Git in Rust with Agents
Hacker News (AI keywords) · 5 days ago · Commentary
GitButler's Grit project aims to rewrite Git's C codebase in Rust, leaning heavily on AI coding agents to accelerate the migration. The post shares first-hand observations on where agents excel—understanding Git's object model, generating idiomatic Rust—and where they fall short, such as ownership edge cases and hallucinated behavior. It serves as a rare real-world case study of AI-assisted rewriting of complex systems-level software.
Can Voice Agents Handle Bilingual Customers? Benchmarking Frontier ASR on Code-Switched Speech
Hugging Face Blog · 5 days ago · Benchmark
Code-switching—where bilingual speakers blend two languages in a single utterance—is common in markets like Taiwan, Singapore, and India, yet most ASR benchmarks focus on monolingual audio. ServiceNow AI evaluates frontier speech recognition models specifically on this mixed-language scenario. The findings help enterprise teams make informed ASR model choices when deploying voice agents for multilingual customer-facing applications.
Anthropic says these topics are too dangerous to let its Fable 5 model talk about
Ars Technica AI · 5 days ago · Ethics
Anthropic has announced that its latest frontier model, Fable 5, enforces hard refusals on topics deemed too dangerous, specifically cybersecurity, biology, and chemistry. The move reflects the company's ongoing effort to balance capability with safety as models grow more powerful. For developers and researchers in these fields, the restrictions may limit practical usability in legitimate professional contexts.
RTX PRO 6000 Listed at $13,250 on NVIDIA Official Page
r/LocalLLaMA top day · 5 days ago · Hardware
An r/LocalLLaMA post points to NVIDIA Marketplace showing the RTX PRO 6000 Blackwell Workstation Edition priced at $13,250. The post asks when this official-page price appeared, without adding benchmarks or broader pricing evidence. For local LLM users, the figure matters because workstation GPU pricing directly affects the economics of self-hosted inference, experimentation, and small-team AI hardware planning.
Quoting Andrej Karpathy on Claude Fable 5 and Jevons' Paradox in AI-generated software
Simon Willison's Weblog · 5 days ago · Commentary
Andrej Karpathy shares that Claude Fable 5 has made working software feel like an open tap, triggering Jevons' Paradox: the cheaper it gets to build software, the more software he wants. He lists use cases ranging from bespoke single-use apps and hyper-specific dashboards to 10x test suites, auto-optimized code, and custom HTML research reports. He closes with a Matrix reference — "Free your mind" — suggesting AI breaks the mental ceiling on what individuals can ask for.
OSCAR RotationZoo - Offline Spectral Covariance-Aware Rotation for 2-bit KV Cache Quantization
r/LocalLLaMA top day · 5 days ago · Paper
OSCAR applies offline-precomputed rotation matrices—derived from spectral covariance analysis—to reshape KV tensor distributions before 2-bit quantization, suppressing outliers and reducing rounding error. The rotation adds negligible inference overhead since it requires no runtime learning. GGUF downloads for Gemma-4-12B-it, Qwen3-32B, and Qwen3-4B-Thinking are available, with llama.cpp and sglang integrations and an arXiv paper.
Google announces Gemini 3.5 Live Translate for instant voice-to-voice translation
Ars Technica AI · 5 days ago · New Tool
Google has announced Gemini 3.5 Live Translate, a real-time voice-to-voice translation system that preserves the original speaker's tone, pacing, and pitch rather than producing flat synthetic output. The system embeds Google's SynthID watermarks into translated audio, enabling AI content provenance detection without affecting audio quality. This extends Google's Gemini Live multimodal API capabilities into cross-language communication scenarios such as meetings, live streams, and customer service.
Can tech companies learn to love cheaper AI models?
TechCrunch AI · 5 days ago · Business
As the AI model market grows more competitive, cheaper alternatives are emerging that rival flagship models in capability. The central question is whether enterprises can shift from premium models to lower-cost alternatives without sacrificing output quality. If proven viable, this shift could upend AI pricing strategies, enterprise procurement logic, and the market dominance of top-tier model providers.
Apple's AI Can Now Change Your Passwords. What Could Possibly Go Wrong?
Hacker News (AI keywords) · 5 days ago · Commentary
Apple's AI assistant has gained the ability to change account passwords on behalf of users, raising eyebrows in the security community. The author uses pointed sarcasm to question whether delegating password management to an AI system is wise. This development reflects a broader trend of AI agents gaining deeper OS-level permissions, blurring the line between helpful automation and dangerous over-trust.
Ask HN: Are you still using your Vision Pro?
Hacker News (AI keywords) · 5 days ago · Opinion
An Ask HN thread polls the community on whether early adopters still actively use their Apple Vision Pro headsets. Discussion likely covers comfort, battery life, killer-app gaps, and niche use cases that survived past the honeymoon period. While informal, such threads offer a candid signal from a technically sophisticated early-adopter cohort relevant to visionOS developers and spatial computing observers.
