"ARC-AGI-3": Frontier AI Scores Below 1%, Humans Solve 100%

March 27, 2026

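On March 25, 2026, "ARC-AGI-3" was released: a new benchmark for measuring the "intelligence" considered essential to achieving artificial general intelligence (AGI).
ARC-AGI-3 is described as the world's first "Interactive Reasoning Benchmark." It measures an AI agent's ability to engage interactively with unknown, dynamic environments and to learn their rules autonomously. Unlike earlier static puzzles, it evaluates how quickly an AI can adapt in interactive game environments where neither the rules nor the goals are stated.
In the results available at release, humans solved 100% of the tasks, whereas state-of-the-art AI models scored below 1%. This result is said to expose the limits of the "memorization and retrieval" that current AI relies on, and to suggest that achieving true AGI will require new breakthroughs in capabilities such as exploration and planning in unknown environments.
I asked generative AI to dig deeper into ARC-AGI-3. Please note that the AI's research and analysis are based only on publicly available information, do not necessarily reflect the actual situation, and may contain inaccurate information.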

ARC-AGI-3: A New Challenge for Frontier Agentic Intelligence
ARC Prize Foundation
March 24, 2026
https://arcprize.org/media/ARC_AGI_3_Technical_Report.pdf

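March 26, 2026
"ARC-AGI-3," which measures AI intelligence with games whose rules are unknown, has arrived; AI cannot clear them yet, but you can actually play the games that humans can clear 100% of the time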
https://gigazine.net/news/20260326-arc-agi-3/



よろず知財コンサルティングのブログ