#0397
Can AI chatbots deliver high quality and understandable information on penile cancer for patients and their families?
H. Lucas1, N. Sathianethan2
1Western Health, Urology, Melbourne, Australia
2Austin Health, Urology, Melbourne, Australia
Introduction:
Penile cancer is a rare malignancy that often leads patients and their families to seek information online, including through artificial intelligence (AI) platforms. However, evidence on the quality, readability, and accuracy of AI-generated information on penile cancer is limited, and no prior studies have assessed it. This study aims to evaluate the ability of AI chatbots to provide high-quality, understandable, and actionable information on penile cancer to patients and their families.
Material and methods:
We identified commonly searched questions on penile cancer from the American Cancer Society and compiled a list of 22 questions. These were used as input prompts for four AI chatbots: ChatGPT version 3.5, Perplexity, Chat Sonic, and Bing AI. The responses generated by each chatbot were evaluated using validated tools. The DISCERN instrument was used to assess the quality of information (scored 1 to 5), while the Patient Education Materials Assessment Tool (PEMAT) measured understandability and actionability (reported as percentages). Readability was assessed using the Flesch-Kincaid Readability Score, and misinformation was evaluated against established clinical guidelines using a 5-point Likert scale. Response length was summarised as the median word count of the chatbot outputs.
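As an illustration of how scores of this kind can be summarised, the sketch below computes a median and interquartile range (IQR) with Python's standard library. The helper name `median_iqr`, the sample ratings, and the choice of the "inclusive" quartile method are assumptions made for this example; the abstract does not state which software or quartile definition was used.

```python
from statistics import median, quantiles


def median_iqr(values):
    """Return (median, Q1, Q3) for a list of numeric scores.

    Quartiles use the 'inclusive' method, which treats the data as the
    whole population of ratings; other methods give slightly different
    cut points.
    """
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return median(values), q1, q3


# Hypothetical DISCERN ratings (1 to 5); NOT data from the study.
example_discern = [3, 4, 4, 5, 3, 4, 5, 2]

med, q1, q3 = median_iqr(example_discern)
print(f"median {med} (IQR {q1}-{q3})")
```

The same helper could be applied to PEMAT percentages, Likert misinformation ratings, or word counts, since each is reported as a median with an IQR.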
Results:
The AI chatbots provided moderately high-quality information, with a median DISCERN score of 4 (IQR 3–5). Misinformation was minimal, with a median Likert score of 1 (IQR 1–1), indicating high accuracy. The median word count per output was 280 (IQR 165–325). Readability was categorised as difficult, comparable to college-level text, with a median Flesch-Kincaid Readability Score of 47.3 (IQR 42.1–51.8). Understandability was high, with a median PEMAT score of 82% (IQR 74%–89%), but actionability was low, with a PEMAT score of 38% (IQR 35%–50%).
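A score of 47.3 labelled as difficult and college-level is consistent with the standard Flesch Reading Ease scale, on which 30 to 50 is conventionally described as difficult. The sketch below shows that formula and banding, assuming the reported score is on this scale. The function names are hypothetical, and the word, sentence, and syllable counts are supplied by the caller because syllable counting is itself a heuristic that varies between tools.

```python
def flesch_reading_ease(words, sentences, syllables):
    """Flesch Reading Ease from raw counts (higher means easier to read)."""
    if words <= 0 or sentences <= 0:
        raise ValueError("words and sentences must be positive")
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


def reading_ease_band(score):
    """Map a Flesch Reading Ease score to its conventional description."""
    bands = [
        (90, "very easy"),
        (80, "easy"),
        (70, "fairly easy"),
        (60, "standard"),
        (50, "fairly difficult"),
        (30, "difficult"),
    ]
    for lower_bound, label in bands:
        if score >= lower_bound:
            return label
    return "very difficult"


# The median score reported in the Results falls in the 30 to 50 band.
print(reading_ease_band(47.3))
```

Patient education material is usually targeted at a much easier band than this, which is why a median of 47.3 is reported as a limitation despite the high understandability score.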