Latest in AI

Showing: ai-safety · Developers

Trump AI testing plan faces problem: DOGE gutted US security teams
Ars Technica AI · 11 days ago · Regulation
Ars Technica reports that Trump’s administration is considering government safety tests for advanced AI models before deployment. Critics argue the plan may be short-sighted and performative because DOGE cuts have weakened the US teams best positioned to conduct serious AI security reviews. The concern is that testing without staffing, transparency, and enforcement may not prevent dangerous deployments.
Expanding Project Glasswing ★ 76
Hacker News (AI keywords) · 12 days ago · Business
Anthropic is expanding Project Glasswing, its program for using Claude Mythos Preview to find vulnerabilities in critical software. The new cohort includes around 150 organizations across more than 15 countries, including infrastructure providers, vendors, nonprofits, and open-source maintainers. Anthropic frames the expansion as preparation for a world where powerful cyber-capable AI models become cheaper and more widely available, shifting focus from finding bugs to validating, disclosing, patching, and deploying fixes.
Florida sues OpenAI, Sam Altman after multiple ChatGPT-linked murders ★ 78
Ars Technica AI · 13 days ago · Regulation
Florida sued OpenAI and CEO Sam Altman over multiple murders described as linked to ChatGPT. The state's attorney general accused Altman of an "utter disregard" for human lives. The provided excerpt does not identify the cases, explain the alleged causal links, specify the legal claims, or include OpenAI's response, so the allegations require further clarification.
LLMs believe false statements even after explicit warnings that they're false ★ 74
Ars Technica AI · 16 days ago · Paper
A new study describes “Negation Neglect,” where LLMs fine-tuned on documents that explicitly mark claims as false still learn the claims as true. Experiments with fabricated statements found models often absorb entity-event associations more strongly than surrounding warnings or negations. The finding raises concerns for fine-tuning pipelines, misinformation handling, and AI safety datasets that include harmful or false content with disclaimers.
Claude’s new model is more ‘honest’ when it messes up
The Verge AI · 17 days ago · Release
Anthropic is releasing Claude Opus 4.8 and highlighting the model’s “honesty” as a key improvement. The company says it trains its models to avoid unsupported claims, addressing a broader issue where AI systems sometimes jump to conclusions. Based on the provided excerpt, the update is positioned around reliability and uncertainty handling rather than a specific new tool or benchmark result.
At TechCrunch Disrupt 2026: Databricks co-founder on what kills enterprise AI deals
TechCrunch AI · 17 days ago · Business
TechCrunch frames enterprise AI as entering a new phase, where companies are no longer mainly asking whether AI is exciting. The harder question is whether it can be deployed safely at scale. Centered on a TechCrunch Disrupt 2026 discussion with a Databricks co-founder, the article points to safety and broad rollout readiness as key enterprise AI deal concerns.
Major flaw in Google AI search: searching for "disregard" makes the AI ignore its instructions and return a default chatbot reply
The Verge AI · 22 days ago · Incident
Google's AI search feature, "AI Overviews," was recently found by users on the social platform X to have a rather absurd system vulnerability. When a user…
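US government scrambles to respond: internet users use AI to simulate the voices of deceased pilots, sidestepping legal restrictions ★ 75
Ars Technica AI · 23 days ago · Incident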
This controversy stems from strict U.S. legal restrictions on aviation accident investigation data. Under federal law, the National Transportation Safety Board…
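After Big Tech CEOs decline to attend, Trump abruptly cancels signing ceremony for AI safety testing executive order, saying it "hinders innovation" ★ 75
Ars Technica AI · 23 days ago · Business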
According to a report by Ars Technica, U.S. President Donald Trump abruptly canceled an official event that had been scheduled for the signing of an executive…
You can no longer search Google for the word "disregard": AI update breaks the search interface ★ 75
TechCrunch AI · 23 days ago · Incident
According to a TechCrunch report, following a recent AI feature update to Google Search, a baffling system bug emerged: users can now cause the entire Google…
Trump delays signing AI safety executive order, saying its existing provisions could hinder development ★ 80
TechCrunch AI · 24 days ago · Business
US President Donald Trump recently decided to delay signing a highly anticipated AI safety executive order. The core of the order was to establish a…
Google's SynthID AI watermarking technology adopted by OpenAI, NVIDIA, and other major players ★ 85
Ars Technica AI · 26 days ago · Business
As generative AI technology advances at a breakneck pace, AI-generated text, images, audio, and video have reached a point where they are nearly…
Making it easier for users to understand how web content was created and edited: Google expands Content Credentials and SynthID ★ 78
Google DeepMind Blog · 28 days ago · Release
As generative AI technology becomes more widespread, the internet is increasingly flooded with images and information that are difficult to distinguish as real…
Import AI 455: AI systems are about to start building themselves, a first step toward recursive self-improvement ★ 85
Import AI (Jack Clark) · 41 days ago · Commentary
In the latest issue of Import AI 455, Jack Clark guides readers through an exploration of a highly forward-looking and both exciting and concerning theme: AI…
AI and the future of cybersecurity: why openness matters ★ 75
Hugging Face Blog · 54 days ago · Opinion
As artificial intelligence (AI) technology undergoes explosive growth, cybersecurity has become a focal point of concern for governments and enterprises…
Import AI 454: Automated alignment research, safety evaluations of Chinese AI models, and the new 4-bit floating-point format HiFloat4 ★ 75
Import AI (Jack Clark) · 55 days ago · Commentary
In this issue of Import AI 454, written by Jack Clark, the author begins by posing a thought-provoking question about finance and sociology: "At what point…
Import AI 453: Breaking AI agents, MirrorCode, and ten views on "gradual disempowerment" ★ 75
Import AI (Jack Clark) · 62 days ago · Commentary
This issue of Import AI (Issue 453), written by Anthropic co-founder Jack Clark, centers on AI system safety, coding capabilities, and the future of humanity…
Claude Mythos and the needless panic over open-weight models ★ 75
Interconnects (Nathan L.) · 65 days ago · Opinion
In this opinion piece published in Interconnects, prominent AI policy and technology critic Nathan Lambert delivers a sharp critique of the excessive panic…
Google DeepMind publishes new research on guarding against harmful AI manipulation in finance and healthcare ★ 75
Google DeepMind Blog · 81 days ago · Release
Google DeepMind has recently published research findings on preventing harmful manipulation by AI. As large language models (LLMs) and AI Agents become…
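Import AI 449: LLMs training LLMs, 72B distributed training, why computer vision is harder than text generation, and whether AI will trigger a political transition ★ 75
Import AI (Jack Clark) · 90 days ago · Commentary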
This issue of Import AI (No. 449) dives deep into several core frontier topics in the current AI landscape, spanning technical breakthroughs and broad…
Import AI 445: Timing superintelligence, AI cracks frontier math proofs, and a new machine learning research benchmark ★ 75
Import AI (Jack Clark) · 118 days ago · Commentary
In this edition of Import AI (Issue 445), author Jack Clark guides readers through three core topics at the very frontier of AI development: the timeline for…
Import AI 443: Into the mist: Moltbook, agent ecosystems, and the internet in transition ★ 75
Import AI (Jack Clark) · 132 days ago · Commentary
In this edition of Import AI 443, author Jack Clark guides readers through a far-reaching trend already underway: the internet is transforming from…
AssistLoop joins the Vercel Agents marketplace, bringing human-in-the-loop mechanisms to AI agents
Vercel Changelog · 135 days ago · Release
Vercel officially announced that AssistLoop, an AI collaboration platform focused on "Human-in-the-Loop (HITL)" mechanisms, has joined the Vercel Agents…
Hugging Face publishes "Open Responses": what you need to know about the open-source AI policy initiative ★ 75
Hugging Face Blog · 150 days ago · Opinion
Hugging Face, the world's largest open-source AI community platform, has published an article titled "Open Responses," aimed at explaining to developers and…
Import AI 440: Red Queen AI, AI regulating AI, and the O-ring theory of automation ★ 75
Import AI (Jack Clark) · 153 days ago · Opinion
In the latest issue of Import AI 440, author Jack Clark delves into three key structural trends facing AI development today: the Red Queen Effect, the…
ServiceNow AI launches AprielGuard: a guardrail model for safety and adversarial robustness in modern LLM systems ★ 75
Hugging Face Blog · 173 days ago · Release
As large language models (LLMs) are widely deployed across enterprises and various applications, ensuring the safety of their outputs and defending against…
Google DeepMind releases Gemma Scope 2: support for the new Gemma 3 family, advancing AI safety and interpretability research ★ 75
Google DeepMind Blog · 180 days ago · Release
Google DeepMind has officially released Gemma Scope 2, extending its powerful open-source model interpretability tools to the latest Gemma 3 model family. This…
Google DeepMind deepens its partnership with the UK AI Security Institute (UK AISI) ★ 75
Google DeepMind Blog · 185 days ago · Business
Google DeepMind has announced a deepened collaboration with the UK AI Security Institute (UK AISI), with both parties committing to joint work on critical AI…
Google DeepMind brings AI image verification (SynthID) to the Gemini app ★ 80
Google DeepMind Blog · 206 days ago · Release
With the rapid advancement of generative AI technology, identifying the authenticity of images and defending against deepfakes has become an urgent priority…
Hugging Face introduces Voice Consent Gate: a consent mechanism for safe voice cloning ★ 75
Hugging Face Blog · 229 days ago · New Tool
With the rapid advancement of voice cloning technology, generating hyper-realistic synthetic voices has become remarkably easy. However, this has also…
