Latest in AI

OSCAR RotationZoo - Offline Spectral Covariance-Aware Rotation for 2-bit KV Cache Quantization
r/LocalLLaMA top day · 5 days ago · Paper
OSCAR applies offline-precomputed rotation matrices—derived from spectral covariance analysis—to reshape KV tensor distributions before 2-bit quantization, suppressing outliers and reducing rounding error. The rotation adds negligible inference overhead since it requires no runtime learning. GGUF downloads for Gemma-4-12B-it, Qwen3-32B, and Qwen3-4B-Thinking are available, with llama.cpp and sglang integrations and an arXiv paper.
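The general idea of rotating before low-bit quantization can be illustrated with a toy experiment. The sketch below is not OSCAR's method: it swaps in a random orthogonal matrix where OSCAR uses rotations derived from spectral covariance analysis, and the tensor shape, the outlier channel, and the per-row min/max 2-bit quantizer are all assumptions made for illustration.

```python
import numpy as np

def quantize_2bit(x):
    """Per-row uniform 2-bit quantization (4 levels from row min to row max), returned dequantized."""
    lo = x.min(axis=1, keepdims=True)
    hi = x.max(axis=1, keepdims=True)
    scale = (hi - lo) / 3.0
    scale = np.where(scale == 0, 1.0, scale)  # guard against constant rows
    codes = np.clip(np.round((x - lo) / scale), 0, 3)
    return codes * scale + lo

def random_rotation(dim, seed=0):
    """Random orthogonal matrix via QR; a stand-in for a precomputed rotation (assumption)."""
    rng = np.random.default_rng(seed)
    q, r = np.linalg.qr(rng.standard_normal((dim, dim)))
    return q * np.sign(np.diag(r))  # flipping column signs keeps q orthogonal

def mse(a, b):
    return float(np.mean((a - b) ** 2))

# Hypothetical "KV-like" data: 512 tokens, 64 channels, one outlier channel.
rng = np.random.default_rng(1)
x = rng.standard_normal((512, 64))
x[:, 7] *= 20.0

rot = random_rotation(64)  # computed once, offline

plain_err = mse(x, quantize_2bit(x))
# Rotate, quantize, dequantize, then rotate back with the transpose.
rotated_err = mse(x, quantize_2bit(x @ rot) @ rot.T)

print(f"2-bit MSE without rotation: {plain_err:.3f}")
print(f"2-bit MSE with rotation:    {rotated_err:.3f}")
```

In this toy setup the rotation spreads the outlier channel's energy across all channels, so each row's quantization range shrinks and the rounding error drops. Because the matrix is fixed ahead of time, the only runtime cost is the matrix multiply.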
SCAIL-2: Open-Source End-to-End Character Animation Without Intermediate Pose Representations
r/LocalLLaMA top day · 5 days ago · Release
SCAIL-2 by zai-org removes the reliance on skeleton maps and inpainting masks common in prior character animation pipelines, driving characters directly from video in an end-to-end manner. Trained on 60K synthesized motion pairs using SCAIL-Preview, Wan-Animate, and MoCha via a Unified Motion Transfer Interface with RoPE design, the model develops emergent abilities beyond its teacher models. Capabilities include cross-identity character replacement, animal-driving scenarios, and zero-shot support for SAM3D-Body mesh rendering.
GPT-2: Too Dangerous To Release — A 2019 Retrospective
Hacker News (AI keywords) · 5 days ago · Commentary
In 2019, OpenAI staged the release of GPT-2, citing fears it could enable large-scale disinformation and spam generation. The move sparked debate: was it responsible AI safety practice or a savvy PR stunt? Written in late 2022, this blog post revisits the episode now that GPT-2 looks quaint compared to GPT-3/4, asking whether the original fears were justified.
Releasing Cohere North Mini Code
r/LocalLLaMA top day · 5 days ago · Release
Cohere’s Jay Alammar announced the official release of North Mini Code after early community feedback from r/LocalLLaMA. Weights are available on Hugging Face, including an fp8 version, and the model can be tried for free through OpenCode. For vLLM deployment, Cohere recommends using vLLM main for now and installing cohere_melody for accurate response parsing, while noting community requests for quantization and llama.cpp support.
Where is the AI Jobs Crisis?
Hacker News (AI keywords) · 5 days ago · Commentary
Apollo Wealth's Daily Spark column revisits the AI jobs crisis narrative from an institutional investment perspective. Despite widespread enterprise adoption of generative AI tools, major labor markets have not shown the structural collapse many analysts predicted. The piece implies AI's employment impact may be slower, more uneven, or manifesting differently than the classic automation-displacement model suggests.
Watch agents fight: a live challenge to speed up Gemma 4 E4B inference on a single A10G
r/LocalLLaMA top day · 5 days ago · Benchmark
A public HuggingFace Spaces dashboard hosts a live competition where AI agents race to optimize Gemma 4 E4B inference throughput on a single NVIDIA A10G GPU. The challenge gamifies ML inference engineering, letting anyone watch agents explore quantization and scheduling strategies in real time. Optimization recipes surfaced by the competition offer practical value for developers targeting single-GPU self-hosted Gemma 4 deployments.
What it feels like to work with Mythos
One Useful Thing (Mollick) · 5 days ago · Commentary
Ethan Mollick of One Useful Thing shares his personal experience working with Mythos, a project tied to Claude Fable. His central claim is that Claude Fable represents another significant, qualitative leap in AI capability rather than an incremental update. Writing from a knowledge-worker perspective rather than a purely technical one, Mollick's assessment serves as an early signal for practitioners evaluating whether this model meaningfully changes how they work.
Anthropic Releases Claude Fable 5, Its First Mythos-Class Model ★ 78
The Verge AI · 5 days ago · Release
Anthropic has released Claude Fable 5, the company's most powerful model ever made widely available and its first under the new 'Mythos' model class. The model shows exceptional performance across software engineering, knowledge work, and vision tasks. Its advantage over competing models reportedly grows wider as tasks increase in length and complexity, making it particularly suited for demanding, multi-step workloads.
Anthropic Releases Claude Fable 5, Its First Public Mythos-Class Model, With Guardrails for High-Risk Domains ★ 76
TechCrunch AI · 5 days ago · Release
Anthropic has released Claude Fable 5, marking the first time a model from its high-capability Mythos family is available to the general public. The model includes built-in guardrails that restrict responses in high-risk domains such as cybersecurity and biology to mitigate misuse potential. The launch comes just days after Anthropic publicly warned that AI technology is becoming increasingly and alarmingly dangerous.
System Card: Claude Fable 5 and Claude Mythos 5 ★ 82
Hacker News (AI keywords) · 5 days ago · Release
Anthropic has published system cards for its two newest flagship models, Claude Fable 5 and Claude Mythos 5, following its standard responsible-release practice. These documents cover dangerous capability evaluations, ASL safety-level determinations, red-teaming results, and alignment assessments under the company's Responsible Scaling Policy. They serve as primary references for safety researchers, enterprise buyers, regulators, and developers assessing model risk and deployment suitability.
Anthropic Launches Claude Fable 5 ★ 85
Hacker News (AI keywords) · 5 days ago · Release
Anthropic announced Claude Fable 5 on June 9, 2026, marking a new naming generation beyond the Claude 4.X family. The announcement URL also references 'Mythos 5,' suggesting a companion model may be included in this release. With model ID claude-fable-5, this is Anthropic's most current model and relevant to developers, researchers, and enterprise users integrating Claude APIs.
Cohere North Mini Code 1.0
r/LocalLLaMA top day · 5 days ago · Release
CohereLabs’ North Mini Code 1.0 appears to have moved from early access to final release, with weights available on Hugging Face. The Reddit post describes it as a 30B A3B coding model. Its Artificial Analysis overall score of 28 trails Qwen 3.6 35B at 43, but its coding index score of 33 is close to Qwen’s 35 and above Gemma 4 26B’s 22.
Apple Embraces AI Photo Editing at WWDC 2026, Reversing Its Caution Over Distorting Reality
The Verge AI · 5 days ago · New Tool
Apple, once skeptical of generative AI photo editing over reality-distortion concerns, unveiled a suite of AI image manipulation tools at WWDC 2026. The move marks a fundamental strategic shift, putting Apple on par with Google Photos and Samsung, which have offered similar features for years. The new tools—expected in iOS 27—will give users effortless image manipulation capabilities, reigniting debates around deepfakes and photo authenticity.
Unsloth Gemma 4 QAT MTP assistant models now available
r/LocalLLaMA top day · 5 days ago · Release
An r/LocalLLaMA post notes that Unsloth’s Gemma 4 QAT MTP assistant models are now available in GGUF format. The root directories include q8_0 files named mtp-gemma-4-*.gguf, while MTP folders contain q8_0 and larger quantized variants. The listed releases cover 12B, 26B-A4B, 31B, E2B, E2B mobile, E4B, and E4B mobile it-qat-GGUF repositories.
TTS Benchmark Revamped with Objective Standards and Blind ELO Voting (46 Models)
r/LocalLLaMA top day · 5 days ago · Benchmark
Reddit user UkieTechie has revamped their TTS benchmark platform with objective scoring standards and live blind voting, now covering 46 speech synthesis models. Hosted as a Hugging Face Space, the arena lets users vote on audio quality without seeing model names, and the votes feed a dynamic Elo leaderboard. The project is open-source on GitHub and welcomes community submissions of new models.
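For readers unfamiliar with how pairwise blind votes turn into a leaderboard, here is a minimal sketch of the standard Elo update. The benchmark's actual rating parameters are not stated in the post. The K-factor of 32, the 400-point scale, the 1500 starting rating, and the model names below are conventional or hypothetical choices, not details from the project.

```python
def expected_score(rating_a, rating_b):
    """Probability that A beats B under the standard Elo logistic model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))

def update_elo(rating_a, rating_b, a_won, k=32.0):
    """Return the new (rating_a, rating_b) after one blind vote."""
    delta = k * ((1.0 if a_won else 0.0) - expected_score(rating_a, rating_b))
    return rating_a + delta, rating_b - delta

# Hypothetical models, both starting at an assumed default rating of 1500.
ratings = {"model_x": 1500.0, "model_y": 1500.0}

# One voter prefers model_x's audio clip.
ratings["model_x"], ratings["model_y"] = update_elo(
    ratings["model_x"], ratings["model_y"], a_won=True
)
print(ratings)  # {'model_x': 1516.0, 'model_y': 1484.0}
```

An upset win against a higher-rated model moves ratings more than an expected win does, which is why a leaderboard built this way keeps shifting as votes come in.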
'Sloppenheimer:' Amazon Employees Mock the Company's AI on Slack
Hacker News (AI keywords) · 5 days ago · Commentary
Amazon employees have been using the term 'Sloppenheimer'—a portmanteau of 'slop' and 'Oppenheimer'—to mock their company's AI products on internal Slack channels. The incident highlights a stark gap between Amazon's aggressive public AI messaging and internal employee skepticism about actual output quality. It reflects a broader industry backlash against AI-generated low-quality content across major tech platforms.
Introducing North Mini Code: Cohere's First Model For Developers
Hugging Face Blog · 5 days ago · Release
Cohere officially introduces North Mini Code, the first model in its North lineup explicitly aimed at developers rather than enterprise API customers. The 'Mini' designation signals a compact, cost-efficient model suited for IDE integrations, CLI tools, and real-time code completion. This marks a strategic expansion for Cohere beyond B2B into the broader developer tooling ecosystem.
Judge Learns Both Sides Used AI, Cancels Trial, Kicks Everyone Off the Case
Hacker News (AI keywords) · 5 days ago · Incident
In a rare legal incident, a judge found that attorneys on both sides of a case had used AI tools in their legal work. The judge responded by canceling the trial entirely and dismissing all lawyers involved. The case highlights growing judicial frustration with unchecked AI use in court filings and the serious professional consequences that can follow.
FCC Wants to Kill Burner Phones by Forcing Telecoms to Verify All Customers' IDs
Hacker News (AI keywords) · 5 days ago · Regulation
The FCC is proposing rules that would require telecom carriers to verify the identity of every customer before activating service. This move would eliminate anonymous prepaid 'burner phones,' long used by journalists, domestic abuse survivors, and privacy-conscious individuals. Critics warn the policy could undermine digital privacy and disproportionately harm vulnerable populations, while proponents argue it would curb fraud and criminal activity.
Can LLMs Beat Classical Hyperparameter Optimization Algorithms?
Hacker News (AI keywords) · 5 days ago · Benchmark
This paper investigates whether LLMs can serve as effective hyperparameter optimization (HPO) agents, competing with established classical methods such as Bayesian optimization, TPE, and random search. The study likely employs a systematic evaluation framework where LLMs iteratively suggest hyperparameter configurations based on task descriptions and historical evaluation results. Findings aim to clarify the practical potential and limitations of LLMs in AutoML pipelines.
Build a Basic AI Agent from Scratch: Long Task Planning
Hacker News (AI keywords) · 5 days ago · Tutorial
This source appears to be a tutorial about constructing a basic AI agent from scratch. Based only on the title, its focus is likely long-task planning: how an agent breaks a larger objective into steps and works through them over time. No article body was provided, so specific implementation choices, model providers, tools, code examples, or evaluation results cannot be confirmed.
Single-slot half-height PCIe V100 with NVLink appears in China
r/LocalLLaMA top day · 5 days ago · Hardware
An r/LocalLLaMA post says a Bilibili creator has shown a single-slot, half-height PCIe V100 with NVLink on a custom PCB. The card is described as 16 cm long, passively cooled by default, and capped at 75W, with another version supporting up to 300W. The 16GB model is expected around or below ¥1500, with a 32GB version reportedly planned, but it is not yet available for purchase.
Rick & Morty
r/LocalLLaMA top day · 5 days ago · Commentary
This r/LocalLLaMA top-day post is a short image meme titled “Rick & Morty.” The only accompanying text says, “nobody expected HF there,” suggesting surprise at HF appearing in the image’s context. There are no technical claims, model details, releases, or benchmarks, so its value is mainly as a small signal of community culture around Hugging Face / HF and local LLM discussions.
Google Introduces Gemma 4 12B: A Unified, Encoder-Free Multimodal Model ★ 85
Google DeepMind Blog · 5 days ago · Release
Google DeepMind has unveiled Gemma 4 12B, a next-generation open-weights model featuring a unified, encoder-free multimodal architecture. By eliminating the traditional separate vision encoder (such as ViT), it processes diverse modalities directly within a single Transformer network. This design simplifies training, reduces inference latency, and enhances cross-modal alignment, marking a significant milestone for open-source AI.
PR-CAD: Progressive Refinement for Text-to-CAD Generation with LLMs
Hacker News (AI keywords) · 5 days ago · Paper
This arXiv paper introduces PR-CAD, a framework for controllable and faithful text-to-CAD generation with large language models. It treats CAD creation and editing as one progressive refinement process rather than separate tasks. The authors curate an interaction dataset and report state-of-the-art controllability and faithfulness on public benchmarks.
Google DeepMind Launches Initiative to Power the Future of Robotics in Europe ★ 70
Google DeepMind Blog · 5 days ago · Business
Google DeepMind has unveiled a strategic initiative to power the future of robotics in Europe. The program focuses on advancing Embodied AI and physical AI through deep collaborations with European academic institutions and industry partners. By combining DeepMind's AI expertise with Europe's strong engineering foundation, the initiative aims to accelerate breakthroughs in robotic generalization and safety.
PSA: Throttle GPU Power Limits for Major Energy Savings with Minimal Inference Performance Loss
r/LocalLLaMA top day · 5 days ago · Hardware
A Reddit user reminds the local LLM community that lowering GPU power limits can yield outsized energy savings at minimal performance cost. On dual Radeon VII cards, cutting the limit from 250W to 100W per card resulted in a less than 10% drop in inference speed. LLM inference is typically memory-bound rather than compute-bound, which makes it more tolerant of reduced GPU clock speeds than training or rendering tasks.
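The figures in the post translate into a rough energy-per-token estimate. The sketch below assumes each card actually draws close to its power limit under load, which the post does not confirm, and treats the "less than 10%" slowdown as exactly 10% to get a conservative bound.

```python
def relative_energy_per_token(power_before_w, power_after_w, speed_retained):
    """Energy per token after the change, as a fraction of the original.

    Energy per token is power divided by throughput, so the ratio is
    (power_after / power_before) / speed_retained.
    """
    return (power_after_w / power_before_w) / speed_retained

# 250W -> 100W per card, keeping at least 90% of the original tokens/s.
ratio = relative_energy_per_token(250.0, 100.0, 0.90)
savings_pct = (1.0 - ratio) * 100.0

print(f"Energy per token: {ratio:.3f}x the original")
print(f"Savings per token: about {savings_pct:.1f}%")
```

Under these assumptions each token costs at most about 0.44 times the original energy, a saving of roughly 55% or more, since any slowdown smaller than 10% only improves the ratio.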
Apple Announced a New On-Device Inference Engine for Apple Silicon
r/LocalLLaMA top day · 5 days ago · Release
Apple announced CoreAI at WWDC, which the post frames as a possible future replacement for CoreML and an alternative to MLX, llama.cpp, and torch for optimized on-device inference. Models still need conversion through Python scripts, and current supported models appear mostly from mid-2025. No performance data is available yet; the author expects it may trail MLX on GPU, but Apple’s 20B on-device foundation model claim suggests larger app-bundled models could become possible.
Is Grep All You Need? How Agent Harnesses Reshape Agentic Search
Hacker News (AI keywords) · 5 days ago · Paper
Echoing the famous Transformer paper, this work asks whether grep alone is sufficient for agentic search scenarios. The study focuses on 'agent harnesses'—the scaffolding wrapping an LLM, including prompting strategy, tool access, and memory—as the primary driver of search quality. Findings suggest harness design may matter more than the underlying model, challenging the community's focus on model scaling.
Rust-native CPU-only LFM2.5-8B-A1B inference library "bebelm" published as cargo crate
r/LocalLLaMA top day · 5 days ago · New Tool
Community developer maximecb has published bebelm, a Rust-native, GPU-free inference implementation of Liquid AI's LFM2.5-8B-A1B model, available on crates.io. Decode speed reaches ~37 tokens/s on a Ryzen 7950x with ~7GB memory footprint; prefill is unoptimized and currently similar in speed to decode. The library supports tool-use callbacks, weight sharing across multiple Agent instances with independent KV caches, and Agent cloning to skip repeated prefill on shared prompts.
