Import AI 460 covers SocioHack, a benchmark where RL-trained LLMs discover loopholes in institutional rule systems. It also discusses evidence from Anthropic for a practical form of recursive self-improvement, reflected in a sharp increase in code merged during 2026. Other sections examine multi-agent RL drones outperforming a champion human pilot, and research showing that state-controlled media can shape LLM responses in local languages.
In March 2016, Google DeepMind's AlphaGo faced legendary Go player Lee Sedol in a historic match in Seoul, ultimately winning 4 to 1. The match not only…
The International Mathematical Olympiad (IMO) has been held annually since 1959 and is the most prestigious and difficult mathematics competition for high…
Google DeepMind has announced a strategic partnership with Commonwealth Fusion Systems (CFS), a nuclear fusion startup spun out of the Massachusetts Institute…
OpenAI recently held a live stream and published a blog post to officially announce the new reasoning model o3 and the lightweight reasoning model o4-mini…
### Background and the Goals of the Open-R1 Project

Since the release of DeepSeek-R1, its powerful reasoning capability and remarkably low training cost have…