Latest in AI

Is Grep All You Need? How Agent Harnesses Reshape Agentic Search
Hacker News (AI keywords) · 5 days ago · Paper
The title echoes the Transformer paper "Attention Is All You Need": this work asks whether grep alone is sufficient for agentic search. The study focuses on 'agent harnesses' (the scaffolding wrapping an LLM, including prompting strategy, tool access, and memory) as the primary driver of search quality. Findings suggest harness design may matter more than the underlying model, challenging the community's focus on model scaling.
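To make "grep as a search tool" concrete, here is a minimal sketch of the kind of tool a harness might expose to a model. It is not taken from the paper: the function name `grep_tool`, its parameters, and the hit format are all assumptions chosen for illustration.

```python
import re
from pathlib import Path


def grep_tool(pattern: str, root: str, max_hits: int = 20) -> list[tuple[str, int, str]]:
    """Hypothetical harness tool: regex-search text files under `root`.

    Returns (relative path, line number, stripped line) tuples, capped at
    `max_hits` so the result fits in a model's context window.
    """
    regex = re.compile(pattern)
    root_path = Path(root)
    hits: list[tuple[str, int, str]] = []
    for path in sorted(root_path.rglob("*")):
        if not path.is_file():
            continue
        try:
            text = path.read_text(encoding="utf-8")
        except (UnicodeDecodeError, OSError):
            continue  # skip binary or unreadable files
        for lineno, line in enumerate(text.splitlines(), start=1):
            if regex.search(line):
                rel = path.relative_to(root_path).as_posix()
                hits.append((rel, lineno, line.strip()))
                if len(hits) >= max_hits:
                    return hits
    return hits
```

In a real harness the cap, the output format, and how results are fed back into the prompt are exactly the design choices the summary describes as mattering.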
Rust-native CPU-only LFM2.5-8B-A1B inference library "bebelm" published as cargo crate
r/LocalLLaMA top day · 5 days ago · New Tool
Community developer maximecb has published bebelm, a Rust-native, GPU-free inference implementation of Liquid AI's LFM2.5-8B-A1B model, available on crates.io. Decode speed reaches ~37 tokens/s on a Ryzen 7950x with ~7GB memory footprint; prefill is unoptimized and currently similar in speed to decode. The library supports tool-use callbacks, weight sharing across multiple Agent instances with independent KV caches, and Agent cloning to skip repeated prefill on shared prompts.
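The weight-sharing and cloning idea can be illustrated with a toy model. This is a conceptual sketch in Python, not bebelm's Rust API: the `Weights` and `Agent` classes and their methods are hypothetical, and the "KV cache" is just a list of tokens standing in for real key/value tensors.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Weights:
    """Hypothetical stand-in for large, read-only model weights."""
    name: str


@dataclass
class Agent:
    weights: Weights                                    # shared by reference
    kv_cache: list[str] = field(default_factory=list)   # private per agent

    def prefill(self, tokens: list[str]) -> None:
        # Toy stand-in for computing and storing K/V entries for `tokens`.
        self.kv_cache.extend(tokens)

    def clone(self) -> "Agent":
        # Copy only the cache; the weights object is reused, so a clone
        # starts with the shared prompt already processed.
        return Agent(self.weights, list(self.kv_cache))


base = Agent(Weights("shared-model"))
base.prefill(["system", "prompt"])       # pay the prefill cost once
worker_a, worker_b = base.clone(), base.clone()
worker_a.prefill(["task", "A"])          # diverges without touching worker_b
```

The point of the design is that the expensive parts (weights in memory, prefill of a shared prompt) are paid once, while each agent's conversation state stays independent.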
Apple Says Its AI is Still Private Even When Running on Google's Servers ★ 75
Ars Technica AI · 5 days ago · Ethics
Apple clarified that running some of its AI models on Google's cloud infrastructure does not compromise user privacy. Apple says its Private Cloud Compute (PCC) architecture processes all data in secure enclaves with end-to-end encryption, so Google has no access to user data. The statement is aimed at privacy concerns over Apple's cloud partnerships.
Lovable Hits $500M Annualized Revenue with 1 Million New Projects Weekly ★ 82
TechCrunch AI · 5 days ago · Business
AI software development platform Lovable has surpassed $500 million in annualized run-rate revenue (ARR). The company reports that users are now launching over 1 million new projects per week on the platform. This rapid growth highlights a major shift, with users increasingly leveraging AI to build full-scale businesses and replace legacy internal software.
Jetson Orin NX Build for Hermes Agent + Benchmarking
r/LocalLLaMA top day · 5 days ago · Hardware
The post describes turning an unused Jetson Orin NX into a compact local LLM server for Hermes Agent testing. The goals were low noise, over 10 tok/s generation, 300 tok/s prompt processing, at least 65K context, and a custom case. After testing Gemma 4, Qwen 3.6, and many quant variants, the author reports Gemma 4 26B A4B UD Q2_K_XL reaching 66K context and 10.21 tok/s near 60K context.
NeuroBait: I fine-tuned a model to spark dopamine for ADHD brain
Hugging Face Blog · 5 days ago · New Tool
NeuroBait is a Hugging Face community project built to help with ADHD task-initiation freeze rather than diagnosis or to-do planning. It fine-tunes google/gemma-3-12b-it with LoRA to produce short, warm, context-aware nudges. The project uses Unsloth and Modal for training, then deploys on a Hugging Face Space with Gradio, transformers, peft, and a runtime LoRA adapter.
Amap Releases ABot-Earth 0.5: Shifting from 2D Distillation to 3D Native for Consistent Scene Generation ★ 70
量子位 QbitAI · 5 days ago · Release
Amap has released ABot-Earth 0.5, its latest spatial intelligence model. Moving beyond traditional 2D distillation methods (like Score Distillation Sampling), the model adopts a 3D-native architecture. The shift targets multi-view inconsistency and distortion, aiming for highly consistent 3D scene generation for autonomous driving simulation, smart cities, and digital twin mapping.
Is a New Player Joining China’s Top-Tier General AI Models?
量子位 QbitAI · 5 days ago · Commentary
Based only on the title, the article likely examines China’s domestic general-purpose AI model landscape and asks whether a new company or model is entering the top tier. It appears to be an industry observation rather than a technical paper or tutorial. Without the full text, the specific model, company, benchmark evidence, and business context cannot be verified.
A 4B Edge-Deployable Cognitive Model Built in China
量子位 QbitAI · 5 days ago · Release
QbitAI’s headline says a domestic Chinese team has built a 4B-parameter “cognitive model” suitable for edge deployment. The framing links it to a model direction previously associated with Andrej Karpathy. Since the article body was not provided, details such as the model name, architecture, benchmark results, hardware requirements, open-source status, and licensing remain unverified.
Xiaohongshu Is Growing a GitHub for AI Skills
量子位 QbitAI · 5 days ago · New Tool
QbitAI reports that Xiaohongshu is testing RED Skill, letting creators attach AI Skills directly under posts. Users can open a Skill page and copy it into assistants such as Codex, Claude Code, or OpenClaw. Nearly 1,000 original Skills have appeared during testing, spanning PPTs, interviews, papers, fitness, travel, and lifestyle use cases, with broader creator rollout expected in July.
Voice AI for Greece
ElevenLabs Blog · 5 days ago · Business
ElevenLabs published a blog post titled “Voice AI for Greece” on June 9, 2026. Without the article body, the confirmed scope is limited to ElevenLabs, Voice AI, and a Greece-related context. It may be relevant to readers tracking multilingual voice generation, localization, and regional AI adoption, but no specific feature, partnership, or model claim can be verified from the title alone.
ggml-webgpu improves prefill speeds for k-quants in llama.cpp PR
r/LocalLLaMA top day · 5 days ago · Benchmark
llama.cpp PR #24225 improves ggml-webgpu matrix multiplication performance for k-quants and refactors matmul paths for Q4/Q5/Q8 and k-quants. In pp512 tests on an M2 Pro, reported speedups range from about 1.33x to 3.78x across Q2_K, Q3_K, Q4_K, Q5_K, and Q6_K. The largest gains appear on Q3_K models, including Qwen and Gemma examples.
Why Apple’s slow-and-steady AI bet is starting to look pretty smart
TechCrunch AI · 5 days ago · Commentary
The piece revisits criticism that Apple has fallen behind in the AI race, especially around Siri and Apple Intelligence. It argues that Apple’s slower approach could look smarter as the industry moves beyond flashy demos toward reliable, integrated user experiences. The key idea is that Apple’s ecosystem, device control, privacy positioning, and developer reach may matter more than racing to ship standalone AI chatbots.
JetBrains Mellum 2: a really good and performant model
r/LocalLLaMA top day · 5 days ago · Benchmark
An r/LocalLLaMA user shared informal impressions of JetBrains Mellum 2, focusing on local coding-style tasks and tool calls. On an AMD Radeon RX 7900 XT with llama.cpp Vulkan and 131K context, the model reportedly generated around 111 tokens/s and stayed above 100 tokens/s near full context. The author stresses this is not a scientific benchmark, but a practical workflow-oriented test.
Omi Med STT v1: Open-Weight Medical ASR Fine-Tuned from Parakeet 0.6B ★ 72
r/LocalLLaMA top day · 5 days ago · Release
Omi Health’s founder says he fine-tuned NVIDIA Parakeet TDT 0.6B v2 for clinical speech and released Omi Med STT v1 under CC-BY-4.0. The runtime supports Mac, Windows, and Linux, auto-selecting MLX, NeMo, or GGUF/parakeet.cpp backends. In the author’s held-out medical benchmark, it reports 2.37% medical-WER and 145× realtime on local A10 compute.
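For readers unfamiliar with the two metrics, the sketch below computes standard word error rate (word-level edit distance divided by reference length) and a realtime factor. The summary does not define how "medical-WER" differs from plain WER, so this shows only the standard form. Reading "145× realtime" as audio duration divided by processing time is the usual convention and is assumed here; the function names are illustrative.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Standard WER: (substitutions + deletions + insertions) / reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the processed prefix of `ref`
    # and the first j words of `hyp`.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


def realtime_factor(audio_seconds: float, processing_seconds: float) -> float:
    """Seconds of audio transcribed per second of compute (assumed convention)."""
    return audio_seconds / processing_seconds
```

Under these definitions, a 2.37% WER corresponds to roughly 2 to 3 word errors per 100 reference words, and a factor of 145 means a 145-second recording is transcribed in about one second.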
Budgets for API Keys on Vercel AI Gateway
Vercel Changelog · 5 days ago · Release
Vercel has added per-API-key budget controls to its AI Gateway product, enabling developers to set hard spending limits on individual keys. Once a key hits its budget threshold, the gateway automatically blocks further requests, preventing unexpected cost overruns. This is especially useful for multi-tenant apps, team cost allocation, and isolating dev/test environments from production spending.
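The blocking behavior described above can be sketched generically. This is not Vercel's API or implementation: the `BudgetGate` class, its method names, the use of integer cents, and the choice to allow keys with no budget set are all assumptions made for illustration.

```python
class BudgetGate:
    """Hypothetical per-key spend limiter; amounts are integer cents."""

    def __init__(self) -> None:
        self._limit_cents: dict[str, int] = {}
        self._spent_cents: dict[str, int] = {}

    def set_budget(self, key: str, limit_cents: int) -> None:
        self._limit_cents[key] = limit_cents
        self._spent_cents.setdefault(key, 0)

    def authorize(self, key: str) -> bool:
        # Assumption: a key with no budget configured is not restricted.
        if key not in self._limit_cents:
            return True
        # Block once spending has reached the limit.
        return self._spent_cents[key] < self._limit_cents[key]

    def record(self, key: str, cost_cents: int) -> None:
        # Called after a request completes, when its cost is known.
        self._spent_cents[key] = self._spent_cents.get(key, 0) + cost_cents


gate = BudgetGate()
gate.set_budget("dev-key", 100)   # a $1.00 cap for a dev/test key
gate.record("dev-key", 60)
still_allowed = gate.authorize("dev-key")   # under the cap
gate.record("dev-key", 40)
now_blocked = not gate.authorize("dev-key")  # cap reached
```

Because cost is only known after a request finishes, a gate like this can overshoot the limit by one request; a stricter design would reserve an estimated cost up front.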
Apple’s WWDC AI demos looked more real after $250M false ad settlement
TechCrunch AI · 6 days ago · Commentary
TechCrunch notes that Apple’s WWDC 2026 AI demos felt more concrete and realistic, often showing people holding iPhones in use-case scenarios. The framing matters after Apple’s $250 million settlement over allegedly misleading Siri and Apple Intelligence advertising. The piece focuses less on model breakthroughs and more on Apple’s shift toward demos that look deliverable, usable, and legally safer.
Apple is using AI to fix Safari’s extension problem
The Verge AI · 6 days ago · Release
Apple is trying to address Safari’s weaker extension ecosystem with AI. Safari has long lagged behind rival browsers in extension availability, partly because of Apple’s stricter development requirements. In a demo shared by Apple, the company showed users effectively “vibe coding” their own Safari extensions, though the excerpt does not detail model support, review flow, or release timing.
Quick note on recent QAT issues
r/LocalLLaMA top day · 6 days ago · Commentary
The post argues that recent Google QAT quantization has several implementation problems, including token embeddings being quantized to q6k instead of using a pure mode. It also claims llama-quantize has a hardcoded parameter that mismatches some optimized groups, and that 32-block groups are misaligned. The author recommends Unsloth UD Q4_K_XL as a temporary option and says they are working on a patch.
Apple plays catch-up at WWDC
TechCrunch AI · 6 days ago · Commentary
Apple spent much of its WWDC keynote on fixes, performance improvements, and long-requested features before unveiling an upgraded AI-powered Siri. The sequencing suggests Apple wants users to see AI as one piece of a larger software-improvement effort. TechCrunch frames the event as Apple playing catch-up, rather than leading with AI as the sole headline.
Apple bets cheaper AI will woo small developers
TechCrunch AI · 6 days ago · Business
Apple is trying to make AI experimentation cheaper for smaller developers. According to TechCrunch, developers with fewer than 2 million first-time App Store downloads will have cloud API costs waived. The report frames this as a way to attract smaller teams as AI development and experimentation become increasingly expensive.
llama.cpp PR adds MTP support for Gemma-4 E2B and E4B assistants
r/LocalLLaMA top day · 6 days ago · Release
The Reddit post links to ggml-org/llama.cpp Pull Request #24282, which adds MTP support for Gemma-4 E2B and E4B assistants. The submitter frames it as useful for tiny Gemma models on phones, low-end machines, Raspberry Pi, or similarly constrained devices. The post does not include benchmarks, merge status, or setup instructions, so it should be treated as a development signal rather than a finished release.
Introducing FrontierCode ★ 78
Hacker News (AI keywords) · 6 days ago · Benchmark
Cognition launched FrontierCode, a coding benchmark focused on mergeability rather than only functional correctness. It evaluates correctness, tests, scope discipline, style, and repository-specific quality standards. Built with open-source maintainers and extensive quality control, it shows current frontier models still struggle: Claude Opus 4.8 scores 13.4% on the hardest Diamond subset, ahead of GPT-5.5 and Gemini 3.1 Pro.
Say hi to Siri AI: Apple announces more conversational voice assistant ★ 76
Ars Technica AI · 6 days ago · Release
Apple announced “Siri AI,” a more conversational version of its voice assistant planned for this fall. The update is tied to a two-tier AI model overhaul powered in part by Google technology. The move signals Apple’s attempt to close the gap with modern AI assistants while preserving its system-level integration and privacy-focused positioning.
Was BitNet a dead end? What happened to ternary LLMs?
r/LocalLLaMA top day · 6 days ago · Commentary
An r/LocalLLaMA user questions whether BitNet and ternary LLMs were a dead end after earlier promise around efficient low-bit models. The post notes that the largest ternary model appears to remain around 2B parameters. It asks why frontier open-weight AI labs are not visibly pursuing the approach, but provides no technical evidence or definitive answer.
Apple just taught your iPhone to finish your sentences, photos, and workflows
TechCrunch AI · 6 days ago · Release
Apple is bringing new AI-powered features to Safari, Shortcuts, and Passwords apps. The framing suggests AI will be embedded into everyday iPhone tasks, including writing, photo-related actions, and workflow automation. The provided source text does not include details on exact capabilities, device support, privacy design, or rollout timing, so the practical impact remains unclear.
Apple Core AI Framework ★ 76
Hacker News (AI keywords) · 6 days ago · Release
Apple’s Core AI framework is positioned as a developer stack for deploying AI models directly inside apps on Apple silicon. The documentation describes Swift APIs, `.aimodel` assets, model specialization, caching, Xcode profiling, and debugging tools. It appears aimed at developers building low-latency, privacy-conscious on-device inference workflows, though the documentation is marked as preliminary beta information.
Apple will let you build workflows using AI in its new Shortcuts app
TechCrunch AI · 6 days ago · Release
Apple is upgrading the Shortcuts app in iOS 27 with AI-powered workflow creation. Users will be able to describe what they want in natural language, and Apple Intelligence will assemble the needed system and app actions. The feature is meant to make Shortcuts more approachable for non-technical users, with the updated app expected to roll out with iOS 27 this fall.
For the 2nd time in weeks, Microsoft packages laced with credential stealer ★ 72
Ars Technica AI · 6 days ago · Incident
Ars Technica reports a second Microsoft-package security incident in weeks, involving 73 packages laced with a credential stealer. The supplied summary says the malware runs as soon as the packages are opened by an AI agent and can self-replicate. The case highlights a growing software supply-chain risk: AI agents that inspect or operate on code may become execution triggers for malicious packages.
Apple gives Siri its own dedicated app
TechCrunch AI · 6 days ago · New Tool
TechCrunch reports that Siri is finally getting its own dedicated app. The provided text does not include details about features, launch timing, supported devices, or AI capabilities. The move could signal a more prominent product surface for Siri, but the available source text is too limited to confirm broader strategy or functionality.
