KPMG, one of the world's largest professional services firms, withdrew a published report on AI usage after it was found to contain apparent hallucinations — errors likely introduced by an AI system used in its preparation. The incident highlights a sharp irony: AI proving unreliable as a source of information about AI itself. It adds to a growing list of high-profile cases where AI-generated content has undermined the credibility of professional and institutional outputs.
In a rare legal incident, a judge found that attorneys on both sides of a case had used AI tools in their legal work. The judge responded by canceling the trial entirely and dismissing all lawyers involved. The case highlights growing judicial frustration with unchecked AI use in court filings and the serious professional consequences that can follow.
A popular Reddit post highlights a video demonstrating a "Fully Hallucinated Operating System" run entirely inside an LLM. Prompted to act as a terminal, the model simulates file systems, network requests, and command execution purely through text generation. While impractical for production, the experiment showcases the state-tracking and "world model" capabilities of modern LLMs.
Anthropic is releasing Claude Opus 4.8 and highlighting the model’s “honesty” as a key improvement. The company says it trains its models to avoid unsupported claims, addressing a broader issue where AI systems sometimes jump to conclusions. As excerpted, the announcement positions the update around reliability and uncertainty handling rather than a specific new tool or benchmark result.
In an era of rapidly growing AI-assisted writing, the collaboration between writers and AI is undergoing unprecedented tests. Author and documentary filmmaker…
The well-known academic preprint platform arXiv has recently introduced strict new rules regarding AI-generated content. According to the latest policy…
As large language models (LLMs) are deployed across a wide range of industries, ensuring the "factuality" of model outputs and reducing "hallucination" has…
Hugging Face has partnered with Patronus AI — a startup focused on LLM evaluation and defense — to officially launch the **Enterprise Scenarios Leaderboard**…
While large language models (LLMs) have demonstrated remarkable generative capabilities across many domains, "hallucination" — where a model confidently…
In the open-source AI community, the Hugging Face Open LLM Leaderboard serves as an important benchmark for evaluating model capabilities. However, many…