Study Flags Limits of Large Language Models in Breast Imaging Classification
A recent study published in Radiology highlights the potential risks of using general-purpose large language models (LLMs) such as GPT-4 and Google Gemini in breast imaging classification. While these models have shown promise in certain tasks, they may fall short in more complex medical reasoning. The study compared LLMs with board-certified breast radiologists in assigning BI-RADS (Breast Imaging Reporting and Data System) categories and found that agreement between the models and the radiologists was not strong. Lead author Dr. Andrea Cozzi stresses the importance of evaluating the limitations of generic LLMs, especially in scenarios where medical reasoning is critical. The findings emphasize the need for better regulation of LLMs in medical settings to ensure accurate classification of imaging reports and to improve patient care.