Google新実験モデル「Gemini-Exp-1114」が世界トップに

18/11/2024

11月14日にリリースされたGoogleのGemini新実験モデル「Gemini-Exp-1114」がベンチマークでOpenAI o1-preview(2024年9月12日リリース)やClaude 3.5 Sonnet (2024年10月22日アップデート)を抑えて世界1位のAIモデルになったようです。AIチャットボットの実力を対戦形式で評価するプラットフォーム「Chatbot Arena」での対戦成績では、GPT-4oとの対戦で勝率50%、Claude 3.5 Sonnetとの対戦で勝率62%、o1-previewとの対戦で勝率56%を記録。総合スコアは1344点となり、前バージョンから40ポイントの向上を見せたということです。まだ、実験版で、性能は安定していないとの報告もありますが、数学、創作文章、長文クエリ処理、指示への従順性などで優れているようです。
それぞれの生成AIモデルの進化と特徴に応じた使い分けが進むのかもしれません。

Gemini Exp 1114：史上最高のLLM！o1-プレビューとClaude 3.5 Sonnetを上回る！（完全テスト済み）
2024年11月18日
https://note.com/kind_crocus236/n/n95ba48c4f038

gemini-exp-1114モデルが総合分野一位を獲得！その驚異的な能力とは？
https://www.ai-box.biz/post/chatbot-arena-gemini-exp-1114-ranking-analysis

Google新AIモデル「Gemini-Exp-1114」が世界トップに！スモールビジネスの業務効率化に革新をもたらす期待の新星
2024年11月15日
https://ai-wave.jp/2024/11/15/gemini-exp-1114-google-ai-model/

Gemini-exp-1114を最速触ってみた
2024年11月15日
https://note.com/ktworks/n/nbad509e22d17

【Gemini-exp-1114】ベンチマークでGPT-4oを超えた世界最高のLLM！
2024-11-16
https://weel.co.jp/media/tech/gemiini-exp-1114/

【GPT-4o/o1超え】GoogleのGeminiがChatGPTを抑えて世界1位のAIモデルに!!
『Gemini-Exp-1114』を徹底解説。活用事例5選も紹介。
November 17, 2024
https://ai-database.beehiiv.com/p/gemini-1114?_bhlid=accbb25973719f993ab3ee7509f1822e9b6179ee&utm_campaign=gpt-4o-o1-google-gemini-chatgpt-1-ai&utm_medium=newsletter&utm_source=ai-database.beehiiv.com

Google's New Experimental Model 'Gemini-Exp-1114' Becomes the World's Top AI Model

Google's new experimental model "Gemini-Exp-1114," released on November 14, 2024, appears to have become the top-ranked AI model in the world, placing ahead of OpenAI's o1-preview (released on September 12, 2024) and Claude 3.5 Sonnet (updated on October 22, 2024) on the benchmark.
On "Chatbot Arena," a platform that evaluates AI chatbots through head-to-head matchups, Gemini-Exp-1114 recorded win rates of 50% against GPT-4o, 62% against Claude 3.5 Sonnet, and 56% against o1-preview. Its overall score reportedly reached 1344 points, a 40-point improvement over the previous version.
The model is still experimental, and some reports say its performance is not yet stable, but it appears to excel in mathematics, creative writing, long-query handling, and instruction following. As generative AI models evolve, users may increasingly choose among them according to each model's particular strengths.
