PatentScore: Multi-Dimensional Evaluation of LLM-Generated Patent Claims

30/7/2025

Large language models (LLMs) are increasingly used to generate patent claims automatically, but reliable methods for evaluating the generated claims have not yet been established.
Conventional natural language generation (NLG) metrics such as BLEU and ROUGE are not well suited to the legal, technical, and structural requirements specific to patent documents.
"PatentScore" is a new evaluation framework developed to assess the quality of patent claims more appropriately.
I had generative AI dig deeper into "PatentScore."
The research results produced by the Deep Research features of Gemini (Google) and ChatGPT (OpenAI) are attached for reference.
Please note that these AI-generated results are based solely on publicly available information, may not reflect actual conditions, and may contain errors.

[Submitted on 25 May 2025]
PatentScore: Multi-dimensional Evaluation of LLM-Generated Patent Claims
https://arxiv.org/abs/2505.19345


Blog of よろず知財コンサルティング (Yorozu IP Consulting)