Latest in AI

Say hi to Siri AI: Apple announces more conversational voice assistant ★ 76
Ars Technica AI · 5 days ago · Release
Apple announced “Siri AI,” a more conversational version of its voice assistant planned for this fall. The update is tied to a two-tier AI model overhaul powered in part by Google technology. The move signals Apple’s attempt to close the gap with modern AI assistants while preserving its system-level integration and privacy-focused positioning.
Apple Reveals New AI Architecture Built Around Google Gemini Models ★ 78
Hacker News (AI keywords) · 5 days ago · Release
Apple announced a major Apple Intelligence overhaul built around Apple Foundation Models co-developed with Google using technologies behind Gemini. The architecture supports on-device and Private Cloud Compute execution, with stronger reasoning, understanding, and multimodal capabilities. A new system orchestrator coordinates AI features across Apple platforms, though Apple has not yet specified which devices receive the higher-power model.
Gemini 3.5 and Antigravity come to Google NotebookLM
Ars Technica AI · 5 days ago · Release
Google is upgrading NotebookLM with Gemini 3.5 and Antigravity, pushing the product beyond source-based Q&A into more agentic research workflows. The update adds a secure cloud computer for each notebook, enabling code execution, deeper analysis, and richer file outputs. For now, availability is limited to AI Ultra and enterprise customers, with broader rollout planned later.
NotebookLM’s Gemini 3.5 upgrade adds a cloud computer and help finding sources
The Verge AI · 6 days ago · Release
Google is rolling out broad updates to NotebookLM, its AI-powered note-taking and research app launched in 2023. The app now uses Google’s upgraded Gemini 3.5 model, which the company says should provide more accurate and reliable responses. The update also adds a cloud computer and help finding sources, expanding NotebookLM beyond source-based Q&A into a broader research assistant workflow.
[3090] Gemma4 QAT + MTP quick TPS numbers
r/LocalLLaMA top day · 6 days ago · Benchmark
An r/LocalLLaMA user shared quick throughput numbers for Gemma 4 QAT with MTP speculative decoding on an RTX 3090 24GB setup. They report roughly 1.2-1.8x TPS improvement, with Gemma 4 31B moving from about 40 tok/s to 70-80 tok/s. The author frames this as a rough benchmark, using 11 task categories and noting stochastic variation from sampling at temperature 1.0.
Gemma 4 Chat Template now has preserve thinking
r/LocalLLaMA top day · 6 days ago · Release
An r/LocalLLaMA post notes that Gemma 4’s chat template now has “preserve thinking.” The linked discussion points to google/gemma-4-31B-it on Hugging Face, suggesting a template-level change rather than a new model release or benchmark. The original post does not provide detailed usage notes, defaults, compatibility information, or measured effects.
Google DeepMind RCT in Sierra Leone Shows Gemini's Guided Learning Boosts Education ★ 72
Google DeepMind Blog · 6 days ago · Paper
Google DeepMind released results from a randomized controlled trial (RCT) in Sierra Leone evaluating AI's impact on education. The study found that Gemini’s "Guided Learning" feature, which guides students instead of just giving answers, significantly boosted engagement. This research provides rigorous empirical evidence that AI tutoring can accelerate learning and help bridge educational gaps in resource-constrained regions.
Upgrading agentic coding capabilities with the new Devstral models ★ 72
Mistral AI News · 6 days ago · Release
Mistral AI announced two Devstral updates focused on agentic coding workflows: Devstral Small 1.1 and Devstral Medium. Devstral Small 1.1 remains a 24B Apache 2.0 open model and reaches 53.6% on SWE-Bench Verified. Devstral Medium reaches 61.6%, is available through Mistral’s API, and supports private deployment and custom finetuning for enterprises.
Voxtral ★ 78
Mistral AI News · 6 days ago · Release
Mistral AI introduces Voxtral, a speech understanding model family with 24B and 3B variants under Apache 2.0. The models support long-context transcription, audio Q&A, summarization, multilingual detection, and function calling from voice. Mistral says Voxtral is competitive across transcription and audio understanding benchmarks, with API access starting at $0.001 per minute and local downloads available on Hugging Face.
Altman, Amodei, and Hassabis Unite to Back DNA Safety Legislation
量子位 QbitAI · 6 days ago · Regulation
Based on the headline and public reporting, the article covers a rare joint push by Sam Altman, Dario Amodei, Demis Hassabis, and other AI leaders for US biosecurity legislation. They are asking lawmakers to require synthetic DNA and RNA providers to screen customers, orders, and records. The concern is that advanced AI could lower the knowledge barrier for designing dangerous biological agents.
ElevenAPI
ElevenLabs Blog · 6 days ago · New Tool
ElevenAPI is a developer category on the ElevenLabs blog rather than a single detailed article. It collects updates and tutorials around speech, music, conversational agents, API keys, web components, and integrations. Listed posts mention Lovable, ElevenLabs UI, Music API, Claude 3.7 Sonnet, Gemini 2.0 Flash, DeepSeek R1, Voice Isolator API, timestamped TTS endpoints, and Speech-to-Speech API.
Introducing Claude Opus 4.8 ★ 82
Anthropic News · 6 days ago · Release
Anthropic introduced Claude Opus 4.8 as an upgrade over Opus 4.7, with stronger benchmark performance across coding, agentic skills, reasoning, and knowledge work. The release also adds dynamic workflows in Claude Code, effort controls in claude.ai and Cowork, and new Messages API support for system entries inside the messages array. Pricing for regular usage remains unchanged, while fast mode is now cheaper than previous models.
Thoughts on Gemma4 12B vs 26A4B: Which Is Better?
r/LocalLLaMA top day · 6 days ago · Opinion
The post asks the LocalLLaMA community to compare Gemma4 12B and 26A4B, explicitly excluding the 31B model from discussion. The user is mainly interested in creative tasks, writing, and chatting, with coding treated as optional rather than central. No benchmarks or examples are provided, so the post is best read as a model-selection question about subjective quality and practical use.
Google's Official Gemma 4 QAT Q4_0 GGUFs Have Higher Precision Than Unsloth's Q4_K_XL
r/LocalLLaMA top day · 6 days ago · Commentary
An analysis of Gemma 4 QAT GGUF files reveals that Google's official 'Q4_0' releases actually employ a mixed-precision strategy. For smaller models like E2B and E4B, Google keeps critical token embeddings in Q6_K and certain projection weights in F16. This makes Google's Q4_0 files larger and more precise than Unsloth's 'Q4_K_XL' versions, which default to standard Q4_0 for almost all tensors.
Gemma 4 31B FP8 Matches Claude Sonnet 4.6 Medium in Custom Benchmark ★ 75
r/LocalLLaMA top day · 6 days ago · Benchmark
A Reddit user shared benchmark results showing Google's Gemma 4 31B (FP8) performing on par with Claude Sonnet 4.6 Medium. The custom evaluation harness tested complex tasks including Neo4j Cypher queries, entity extraction, agentic tool calling, Python coding, and multi-vector retrieval synthesis. This highlights how quantized mid-sized open-source models are closing the gap with leading proprietary frontier models.
User Shares Gemma 4 QAT Experience: Improved Quality and MTP Speedups
r/LocalLLaMA top day · 6 days ago · Opinion
A Reddit user shared their experience with the Gemma 4 31B QAT (Quantization-Aware Training) model. Compared to traditional GGUF quants like Q6_K_L, the QAT version delivers noticeable quality improvements in roleplay and long-context tasks. Additionally, combining the QAT model with Multi-Token Prediction (MTP) raised generation speed from ~20 t/s to as much as 50 t/s, roughly a 2.5x speedup at best.
MTP and QAT: What is the Relation? Running Gemma 4 31B in llama.cpp
r/LocalLLaMA top day · 6 days ago · Commentary
A popular Reddit thread addresses user confusion over running Gemma 4 31B locally. It distinguishes between MTP (Multi-Token Prediction for inference speedup) and QAT (Quantization-Aware Training for preserving 4-bit quality). It also confirms that llama.cpp's new MTP support requires updated GGUF files and a secondary draft model file for acceleration.
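The draft-model idea mentioned in the summary above can be illustrated with a toy greedy draft-and-verify loop. This is only a sketch under stated assumptions: `speculative_decode`, `target`, and `draft` are hypothetical names, the "models" are plain Python functions over integer tokens, and this is not how llama.cpp or Gemma 4's MTP heads are actually implemented. It shows why a cheap draft can speed up decoding without changing the output: the larger model's token is always the one kept.

```python
from typing import Callable, List

Token = int
# Hypothetical stand-in for a model: maps a token context to its greedy next token.
NextToken = Callable[[List[Token]], Token]


def speculative_decode(
    target: NextToken,
    draft: NextToken,
    prompt: List[Token],
    max_new: int,
    k: int = 4,
) -> List[Token]:
    """Greedy draft-and-verify decoding.

    The result is identical to plain greedy decoding with `target`;
    the draft only decides how many positions are checked per round.
    """
    out = list(prompt)
    goal = len(prompt) + max_new
    while len(out) < goal:
        # 1. The cheap draft model proposes up to k tokens.
        proposal: List[Token] = []
        ctx = list(out)
        for _ in range(min(k, goal - len(out))):
            t = draft(ctx)
            proposal.append(t)
            ctx.append(t)
        # 2. The target model checks each proposed position.
        #    (A real engine scores all k positions in one batched pass;
        #    here the calls are sequential for clarity.)
        for t in proposal:
            expected = target(out)
            out.append(expected)  # always keep the target's own token
            if expected != t:
                break  # first mismatch: drop the rest of the draft
    return out


if __name__ == "__main__":
    # Toy models: the target counts upward mod 10; the draft is wrong after a 4.
    target = lambda ctx: (ctx[-1] + 1) % 10
    draft = lambda ctx: 0 if ctx[-1] == 4 else (ctx[-1] + 1) % 10
    print(speculative_decode(target, draft, [0], max_new=8))
```

The more often the draft agrees with the target, the more tokens are accepted per verification round, which is where the reported speedups would come from.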
I design with Claude more than Figma now
Hacker News (AI keywords) · 7 days ago · Opinion
Jane Street designer Edwin Morris describes moving from skepticism about LLMs to using Claude as a core design tool. Instead of relying mainly on specs and Figma mockups, he now builds working prototypes directly in the real codebase. The post also explores the collaboration risks: prototypes must remain disposable proposals, not finished features that shut reviewers out of design input.
Here comes new Siri again
The Verge AI · 8 days ago · Commentary
The Verge frames Apple as behind in AI, but argues that lagging may not be entirely bad. At WWDC, Apple appears ready to introduce the new Siri again after earlier Apple Intelligence promises slipped. The key question is whether Apple can turn AI into a reliable, system-level assistant experience rather than another generic chatbot feature set.
Mantine DataTable source repo compromised; owner account suspended ★ 74
Hacker News (AI keywords) · 9 days ago · Incident
A GitHub security notice says Mantine DataTable and other repositories received unauthorized commits through the github-actions bot. The npm packages were reported safe; the risk targets developers who recently cloned or pulled the source and open it in VS Code, Cursor, Claude Code, Gemini, or run npm test. A later update links the payload to the Miasma / Shai-Hulud worm family and says a stolen credential is the likely path.
This is your laptop… on AI
The Verge AI · 9 days ago · Hardware
The episode frames developer conference season around Big Tech’s conviction that AI will reshape how people use technology. Nvidia CEO Jensen Huang is highlighted for describing a completely new way to use laptops. Based on the provided excerpt, this is more of an industry commentary on AI PCs than a concrete product-spec report.
The token bill comes due: Inside the scramble to manage AI costs ★ 78
TechCrunch AI · 9 days ago · Business
TechCrunch reports that enterprise AI spending has shifted from rapid adoption to cost control. Even as per-token prices fall, broader AI rollout and agentic coding tools are multiplying consumption, pushing companies over budget. A new Tokenomics Foundation under the Linux Foundation aims to standardize AI token cost tracking, billing metrics, and efficiency language.
Unlocking dependable responses with Gemini Enterprise Agent Platform’s Agentic RAG ★ 72
Google Research Blog · 9 days ago · Release
Google Research and Google Cloud introduced an agentic RAG framework hosted on Gemini Enterprise Agent Platform. It uses multiple agents to plan, rewrite, route, retrieve, verify sufficient context, iterate, and synthesize answers. Google reports up to 34% factuality accuracy gains over standard RAG, plus 90.1% accuracy in a cross-corpus FramesQA setting with similar latency to single-corpus retrieval.
Reve 2 and Ideogram 4: Layouts in Imagegen
Latent Space · 10 days ago · Release
Latent Space’s roundup frames image composition as a major barrier now being tackled by layout-aware image models. Reve 2.0 emphasizes precise generation and editing with layouts, while Ideogram 4.0 uses bounding boxes tied to region descriptions. The issue also covers MAI-Thinking-1, Gemma 4 12B, open audio models, agent execution layers, and model-routing cost debates.
I built a vulnerable app and spent $1,500 seeing if LLMs could hack it
Hacker News (AI keywords) · 10 days ago · Benchmark
The author built a vulnerable React Native app with a Python backend and a Firebase access-control flaw. GPT 5.5 solved 7 of 10 runs, while DeepSeek and Claude variants succeeded in fewer runs. Many other models failed due to refusals, API-focused tunnel vision, false positives, or inability to use the exposed Firebase path correctly.
How LLMs Actually Work
Hacker News (AI keywords) · 10 days ago · Tutorial
The article explains how modern LLMs convert text into token IDs, embeddings, and position-aware vectors before passing them through stacked transformer blocks. It covers attention, multi-head attention, KV cache, GQA, feed-forward networks, MoE, residual streams, normalization, and decoding. Its goal is educational: helping readers understand the common architecture behind many current model families and read model cards or papers more confidently.
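As a rough illustration of the attention step listed above, here is a minimal single-head, causally masked scaled dot-product attention in plain Python. It is a sketch, not code from the linked article: the function names are hypothetical, and it leaves out learned projections, multiple heads, and the KV cache.

```python
import math
from typing import List

Vector = List[float]


def softmax(xs: Vector) -> Vector:
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def causal_attention(q: List[Vector], k: List[Vector], v: List[Vector]) -> List[Vector]:
    """Single-head scaled dot-product attention with a causal mask.

    q, k, v each hold one vector per position. Position i attends only to
    positions 0..i, as in decoder-only language models.
    """
    d = len(q[0])  # query/key dimension used for the 1/sqrt(d) scaling
    out: List[Vector] = []
    for i, qi in enumerate(q):
        # Scaled dot products between query i and every visible key.
        scores = [
            sum(a * b for a, b in zip(qi, k[j])) / math.sqrt(d)
            for j in range(i + 1)
        ]
        weights = softmax(scores)
        # Output is the attention-weighted average of the visible values.
        out.append([
            sum(w * v[j][c] for j, w in enumerate(weights))
            for c in range(len(v[0]))
        ])
    return out


if __name__ == "__main__":
    q = k = [[1.0, 0.0], [0.0, 1.0]]
    v = [[1.0, 2.0], [3.0, 4.0]]
    for row in causal_attention(q, k, v):
        print(row)
```

The first position can only see itself, so its output is exactly its own value vector; later positions blend earlier values according to query-key similarity.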
Google's Gemma 4 12B is designed to run on 16GB RAM laptops
Ars Technica AI · 10 days ago · Release
Google introduced Gemma 4 12B, an open model aimed at running locally on laptops with 16GB of RAM. The model uses a new encoding scheme and token prediction to improve efficiency relative to its size. Its practical importance depends on real-world benchmarks, but it could lower the barrier for private, offline, and local multimodal AI workflows.
As AI gets better, it reveals an empty promise
The Verge AI · 11 days ago · Commentary
The piece uses Google’s Gemini agent Spark as a starting point: its contextual awareness and task execution are impressive, even unsettling. But the author argues AI productivity tools mostly optimize problems created by modern software and work culture. Better assistants may schedule meetings and organize life, yet they cannot fix wage stagnation, layoffs, affordability, surveillance, or a weak social safety net.
Publishers will be able to opt out of AI Search, thanks to new regulation ★ 72
TechCrunch AI · 11 days ago · Regulation
UK regulators are requiring Google to provide a tool that lets website publishers opt out of generative AI Search features. The option will be tested in the UK first, then rolled out globally. The report does not specify the exact mechanism, timing, or whether opting out affects standard Google Search indexing.
Microsoft Build: MAI-Thinking-1 and MAI Family Models ★ 78
Latent Space · 11 days ago · Release
Microsoft used Build to present itself as both an AI platform and a first-party model lab, announcing seven MAI models across reasoning, code, image, transcription, and voice. The standout was MAI-Thinking-1, described as a 35B active MoE with 256K context and clean data lineage. The recap also ties the launches to GitHub Copilot, Windows agent runtime ambitions, Web IQ grounding APIs, Foundry distribution, and MAIA 200 hardware.
