China’s Moonshot AI Releases “Kimi K2 Thinking”

12/11/2025

On November 6, China’s Moonshot AI released “Kimi K2 Thinking,” an AI model specialized for agentic performance. The model is published on Hugging Face as an open-weight release. On “Humanity’s Last Exam,” a benchmark of general reasoning ability, it scored 44.9% accuracy, ahead of GPT-5 (41.7%) and Claude 4.5 (32.0%) under the same conditions. On “BrowseComp,” an agentic reasoning test that involves web search, it scored 60.2%, ahead of GPT-5 (54.9%) and Claude 4.5 (24.1%). Kimi K2 Thinking appears to mark a new milestone for open-source AI.
I had a generative AI analyze the details of “Kimi K2 Thinking.” Please note that the results of this AI-generated research and analysis are based solely on publicly available information, do not necessarily reflect the actual situation, and may contain inaccuracies.

“Kimi K2 Thinking,” a new AI from China: performance surpassing US-made models and the shock of being “free”
2025-11-10
https://japan.zdnet.com/article/35240260/


よろず知財コンサルティングのブログ