
CSPS Home | Past & Future Meetings

Back to 2024 Abstracts

Artificial Intelligence in Aesthetic Surgical Practice: Performance Validation of Bard, a Novel Large Language Model, in Managing Postoperative Patient Concerns & Complications Following Body Contouring
Jad Abi-Rafeh*2, Vanessa Mroueh1, Brian Bassiri-Tehrani3, Jacob Marks4, Roy Kazan2, Foad Nahai3
1Division of Plastic Surgery, University of Pittsburgh, Pittsburgh, PA; 2Division of Plastic, Reconstructive, and Aesthetic Surgery, McGill University, Montreal, QC, Canada; 3Plastic Surgery, Private Practice, Atlanta, GA; 4Manhattan Eye, Ear, and Throat Hospital, New York, NY

Large Language Models (LLMs) have revolutionized the way humans interact with Artificial Intelligence (AI) technology, with marked potential for applications in aesthetic surgery. The present study evaluates the performance of Bard, a novel LLM, in identifying and managing postoperative patient concerns and complications following body contouring surgery.
The American Society of Plastic Surgeons' website was queried to identify and simulate all potential postoperative complications following body contouring across different acuities and severities. Bard's accuracy was assessed in providing a differential diagnosis, soliciting a history, and suggesting a most-likely diagnosis, an appropriate disposition, treatments/interventions to begin from home, and red-flag signs/symptoms indicating deterioration or requiring urgent emergency department (ED) presentation.
Twenty-two simulated body contouring complications were examined. Overall, Bard demonstrated a 59% accuracy in listing relevant diagnoses on its differentials, with a 52% incidence of incorrect or misleading diagnoses. Following history-taking, Bard demonstrated an overall accuracy of 44% in identifying the most-likely diagnosis, and a 55% accuracy in suggesting the indicated medical dispositions. Helpful treatments/interventions to begin from home were suggested with a 40% accuracy, whereas red-flag signs/symptoms, indicating deterioration, were shared with a 48% accuracy. A detailed analysis of performance, stratified according to latency of postoperative presentation (<48 hours, 48 hours to 1 month, or >1 month postoperatively), and according to acuity and indicated medical disposition, is presented herein.
Despite the promising potential of LLMs and AI in healthcare-related applications, Bard's performance in the present study falls significantly short of accepted clinical standards, indicating a need for further research and development prior to adoption.