Are Ideas Generated by LLMs Better Than Those Generated by Humans?

The episode "松田語録:LLMが出すアイデアは人間が出すアイデアよりもいいか?" (Matsuda's Words: Are the ideas generated by LLMs better than those generated by humans?), recorded on September 12, 2024, builds its discussion on the paper "Can LLMs Generate Novel Research Ideas? A Large-Scale Human Study with 100+ NLP Researchers", published on September 6, 2024.

The paper compares three kinds of ideas: ideas generated by humans, ideas generated by AI, and AI-generated ideas that humans then re-evaluated. In other words, it assesses which of the three conditions (humans only, AI only, or AI + humans) produces the best ideas.

松田語録:LLMが出すアイデアは人間が出すアイデアよりもいいか?
https://www.youtube.com/watch?v=p0WnYM-P9AY

Can LLMs Generate Novel Research Ideas? A Large-Scale Human Study with 100+ NLP Researchers
https://arxiv.org/pdf/2409.04109

Abstract of the paper:

Recent advancements in large language models (LLMs) have sparked optimism about their potential to accelerate scientific discovery, with a growing number of works proposing research agents that autonomously generate and validate new ideas. Despite this, no evaluations have shown that LLM systems can take the very first step of producing novel, expert-level ideas, let alone perform the entire research process. We address this by establishing an experimental design that evaluates research idea generation while controlling for confounders and performs the first head-to-head comparison between expert NLP researchers and an LLM ideation agent. By recruiting over 100 NLP researchers to write novel ideas and blind reviews of both LLM and human ideas, we obtain the first statistically significant conclusion on current LLM capabilities for research ideation: we find LLM-generated ideas are judged as more novel (p < 0.05) than human expert ideas while being judged slightly weaker on feasibility. Studying our agent baselines closely, we identify open problems in building and evaluating research agents, including failures of LLM self-evaluation and their lack of diversity in generation. Finally, we acknowledge that human judgements of novelty can be difficult, even by experts, and propose an end-to-end study design which recruits researchers to execute these ideas into full projects, enabling us to study whether these novelty and feasibility judgements result in meaningful differences in research outcome.

「AIサイエンティスト」: AIが自ら研究する時代へ (The AI Scientist: Toward an era in which AI conducts research on its own), August 13, 2024
https://sakana.ai/ai-scientist-jp/
Author: 萬秀憲