Introduction
The rapid evolution of artificial intelligence models such as ChatGPT shows promise across a variety of healthcare applications. This study evaluates the accuracy of ChatGPT's responses to patient queries following surgery for benign prostatic hyperplasia (BPH).
Materials and Methods
A dataset of post-operative questions was compiled from discharge instructions, online forums, and social media, covering various BPH surgical modalities: transurethral resection of the prostate (TURP), simple prostatectomy, laser enucleation, Aquablation, Rezum, GreenLight, UroLift, and iTind. Two senior urology residents graded ChatGPT-3.5's responses using predefined criteria, with discrepancies resolved by a third senior reviewer.
Results
- A total of 496 questions were evaluated by two reviewers, of which 280 were excluded. Of the 216 graded responses:
- 78.2% were comprehensive and correct
- 9.3% were incomplete or partially correct
- 10.2% were misleading or contained a mix of accurate and inaccurate information
- 2.3% were completely inaccurate
- The highest percentage of correct responses was observed for newer procedures (Aquablation, Rezum, iTind) compared with standard procedures (TURP, simple prostatectomy).
- Insufficient context or inaccurate information (36.6%) was the most frequently observed type of error.
Conclusion
This study shows that although ChatGPT has potential for delivering accurate post-operative instructions to patients undergoing BPH surgery, its incomplete answers raise concerns about its clinical utility and underscore the need for further research.
Presented at the European Association of Urology (EAU) Annual Congress, 21–24 March 2025, Madrid, Spain