Generative AI's Ability to Accurately Search and Cite News Content

March 24, 2025

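To evaluate how accurately generative AI can search for and cite news content, a study tested eight generative search tools with real-time search capabilities (ChatGPT, Perplexity, Perplexity Pro, Copilot, Gemini, DeepSeek, Grok 2, and Grok 3). Ten articles were randomly selected from each of 20 publishers, an excerpt from each article was given to each chatbot, and the chatbot was asked to identify the corresponding article's headline, original publisher, publication date, and URL.

Overall, the chatbots gave inaccurate answers to more than 60% of the queries. Perplexity answered 37% of its queries incorrectly, while Grok 3 had a far higher error rate of 94%.

This result, that the tools get things wrong quite often, matches my own impression from using them.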
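The scale of the test follows from the figures quoted above, and a short sketch makes the arithmetic explicit. It assumes that every excerpt was sent to every tool exactly once, which the summary implies but does not state outright. The function and constant names are illustrative only, and the implied counts are approximations derived from the rounded percentages, not figures taken from the study.

```python
# Study design as summarized above. Assumption: each excerpt is sent
# to each tool exactly once, so every tool receives the same queries.
PUBLISHERS = 20
ARTICLES_PER_PUBLISHER = 10
TOOLS = [
    "ChatGPT", "Perplexity", "Perplexity Pro", "Copilot",
    "Gemini", "DeepSeek", "Grok 2", "Grok 3",
]

# Error rates quoted in the summary (percent of queries answered incorrectly).
REPORTED_ERROR_PCT = {"Perplexity": 37, "Grok 3": 94}


def excerpts_total() -> int:
    """Number of article excerpts, which is also the queries per tool."""
    return PUBLISHERS * ARTICLES_PER_PUBLISHER


def queries_total() -> int:
    """Number of queries across all tools."""
    return excerpts_total() * len(TOOLS)


def implied_incorrect(tool: str) -> int:
    """Approximate incorrect answers implied by a tool's quoted error rate."""
    return round(excerpts_total() * REPORTED_ERROR_PCT[tool] / 100)


if __name__ == "__main__":
    print("excerpts per tool:", excerpts_total())
    print("queries overall:", queries_total())
    for name in REPORTED_ERROR_PCT:
        print(name, "implied incorrect answers:", implied_incorrect(name))
```

Under that assumption, "more than 60%" of all queries corresponds to more than 960 inaccurate answers across the eight tools.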

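March 24, 2025
Generative AI search engines cite incorrect information more than 60% of the time. Paid versions are more prone than free versions to be confidently wrong (Generative AI Close-Up)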
https://www.techno-edge.net/article/2025/03/24/4199.html

AI Search Has A Citation Problem
We Compared Eight AI Search Engines. They’re All Bad at Citing News.
March 6, 2025
https://www.cjr.org/tow_center/we-compared-eight-ai-search-engines-theyre-all-bad-at-citing-news.php



よろず知財コンサルティング (Yorozu Chizai Consulting) blog