To log in, click the teal "Login" button in the upper right-hand corner of this page. If you are logged in but still do not have access, please check your 2025 Annual Meeting registration.
Moderator
Yousuf Khalifa, MD
Panelists
Prasanna V Ramesh, MS, DNB
Tam Ch Hoang, MD
Presenting Author
Prasanna V Ramesh, MS, DNB
Purpose
We developed an AI protractor tailored specifically to the intricate analysis of the iridocorneal angle in Anterion (high-resolution swept-source OCT) images.
Methods
This technology swiftly and precisely annotates angles with human-in-the-loop machine learning, meticulously measuring their degrees, and autonomously categorising them into wide open, narrow, and closed angles. Trained on a meticulously curated dataset comprising 1500 images, partitioned into 1370 for training and 130 for testing, this AI protractor exhibited a progressive increase in accuracy over time.
Results
In three distinct tests, the AI demonstrated exceptional sensitivity and specificity, showcasing its robust performance; Test 1 - Sensitivity 82.61% & Specificity 92.59%; Test 2 - Sensitivity 88.24% & Specificity 94.44%; and Test 3 - Sensitivity 95.24% & Specificity 95.83%.
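As a reminder of how the two reported metrics are defined, the following minimal sketch computes sensitivity and specificity from confusion-matrix counts. The abstract does not report the underlying counts, so the numbers below are purely hypothetical and the function name is an illustrative choice.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) as fractions between 0 and 1.

    sensitivity = true positives / all actual positives
    specificity = true negatives / all actual negatives
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity, specificity


# Hypothetical counts, NOT taken from the study.
sens, spec = sensitivity_specificity(tp=90, fn=10, tn=80, fp=20)
print(f"Sensitivity {sens:.2%}, specificity {spec:.2%}")
```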
Conclusion
This AI model also overcomes the black box dilemma thereby enhancing diagnostic precision with constant training via the feedback mechanism.
Presenting Author
Aisling Higham, FRCOphth, MBBS, MSc
Co-Authors
Ernest Lim MBBS, BSc, Nick De Pennington MBA, BM BCh, FRCS, Andrew Lee MD, Saif Aldeen AlRyalat MD, Sanjana Jaiswal MBBS
Purpose
Large language models (LLMs) for healthcare lack scalable evaluation methods. We test the hypothesis that evaluation of LLM-generated responses to patient queries can be automated using readability metrics and automated qualitative evaluations, ultimately supporting development of more usable and empathetic AI-driven clinical conversations.
Methods
A comparative study was conducted using 78 anonymized clinician-curated questions posed to Dora, a clinical agent used in post-cataract surgery follow-ups in the UK. Responses from 2 Texas-based ophthalmologists were compared with responses from three different LLMs using retrieval-augmented generation (RAG). Response length, word complexity, and readability scores were compared between the two groups. Chain-of-thought prompting was employed for LLMs to rate answers on a 1-5 Likert scale for the following domains: patient concern management, empathy, and information provision. Two separate ophthalmologists blind-ranked responses from the LLM versus clinicians for each domain.
Results
Automated metrics revealed variations in response length and readability between LLM-generated and clinician-generated responses. LLM responses had fewer words overall (39%) and shorter sentences (52%) than the clinicians' responses. They were also more readable than clinician responses, with an average Flesch-Kincaid reading grade of 5.6 versus 10.2, respectively. The LLMs' qualitative assessment demonstrated comparable abilities, with a trend toward preferring clinicians for "managing concerns" and "information provision" and LLMs for "empathy". Agreement between clinician ranking and the automated Likert evaluation was 74%, comparable to inter-clinician ranking agreement of 75%.
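For readers unfamiliar with the readability score cited above, here is a minimal sketch of the standard Flesch-Kincaid grade-level formula: 0.39 × (words per sentence) + 11.8 × (syllables per word) − 15.59. The abstract does not say which tool the authors used. The syllable counter is a crude vowel-group heuristic assumed here for illustration, and both sample sentences are invented.

```python
import re


def count_syllables(word):
    """Crude heuristic (an assumption): one syllable per run of vowels."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def flesch_kincaid_grade(text):
    """Approximate Flesch-Kincaid grade level for non-empty English text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    return (0.39 * len(words) / len(sentences)
            + 11.8 * syllables / len(words)
            - 15.59)


# Invented example replies, not taken from the study.
plain = "Your eye may feel sore. This is normal. Use the drops each day."
dense = ("Postoperative ocular discomfort is anticipated following "
         "phacoemulsification, necessitating adherence to the prescribed "
         "topical regimen.")

print(round(flesch_kincaid_grade(plain), 1))
print(round(flesch_kincaid_grade(dense), 1))
```

Short sentences built from short words yield a lower grade, which is the direction of the difference reported between LLM and clinician responses.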
Conclusion
We surface interesting differences in usability patterns between clinician and AI-generated responses to real-world, cataract patient post-op queries. This study supports the feasibility of using automated metrics for evaluating model responses, guiding the design and development of empathetic and usable AI systems.
Presenting Author
Mohammad Soleimani, MD
Purpose
Microbial keratitis (MK) is a major cause of corneal blindness, with outcomes dependent on timely, accurate diagnosis. In low and middle-income countries, limited access to expertise and equipment hampers diagnostics. This study investigates using deep learning (DL) to diagnose and differentiate MK subtypes from smartphone-captured images.
Methods
The dataset comprised 889 cases of bacterial keratitis (BK), fungal keratitis (FK), and Acanthamoeba keratitis (AK) collected from 2020 to 2023. A convolutional neural network-based model was developed and trained for classification.
Results
The model achieved an overall classification accuracy of 83.8%, with specific accuracies for AK, BK, and FK of 81.2%, 82.3%, and 86.6%, respectively, and an AUC of 0.92 for the ROC curves.
Conclusion
The model exhibits practicality, especially with the ease of image acquisition using smartphones, making it applicable in diverse settings.
Presenting Author
Emine Esra Karaca, MD
Co-Authors
Feyza Dicle Işık MD, Ali Seydi Keçeli PhD, Aydın Kaya PhD, Ozlem Evren Kemer MD
Purpose
In this study, a convolutional neural network (CNN) model was developed to automatically predict corneal health from anterior segment optical coherence tomography (AS-OCT) images after DMEK surgery for pseudophakic bullous keratopathy.
Methods
For model training, AS-OCT images from 28 patients who underwent DMEK surgery between January 2021 and June 2022 were used, covering 4 visits between the first and second postoperative years. The decision of healthy/unhealthy was based on criteria such as irregularity and thickening in Descemet's membrane and stroma, opacity/tissue mismatch at graft-host junctions, and corneal thickness. A total of 1208 images, on which the two physicians agreed, were analyzed. After quality filtering, 45% (544) were labeled as healthy, and 55% (654) as unhealthy. 70% of the dataset was used for training, and 30% for testing. 20% of the training set was reserved for validation.
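The split described above (70% training, 30% testing, with 20% of the training set held out for validation) can be worked through as a short arithmetic sketch. The abstract does not report the resulting image counts, the rounding rule, or whether images were split per image or per patient, so the rounding used here is an assumption. The total is the sum of the two labeled counts (544 + 654).

```python
def split_sizes(n_total, test_frac=0.30, val_frac_of_train=0.20):
    """Return (n_train, n_val, n_test) for a nested hold-out split.

    Rounding to the nearest whole image is an assumption, not a
    detail reported in the abstract.
    """
    n_test = round(n_total * test_frac)
    n_train_full = n_total - n_test          # the 70% training portion
    n_val = round(n_train_full * val_frac_of_train)
    n_train = n_train_full - n_val           # images actually fitted on
    return n_train, n_val, n_test


n_train, n_val, n_test = split_sizes(544 + 654)
print(n_train, n_val, n_test)
```

Under these assumptions, roughly 56% of the labeled images are used to fit the model, 14% for validation, and 30% for testing.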
Results
The proposed CNN includes an input layer for 50x50 grayscale images, a 2D convolution layer with 128 3x3 filters, a batch normalization layer, a max pooling layer with 2x2 filters, a 2D convolution layer with 128 1x1 filters, and a softmax layer. The ReLU activation function was used in the layers. The Adam optimization method and a mini-batch size of 25 were set. The maximum number of epochs was defined as 100. The average age was 67.9 ± 10.1 years, with 64.3% (18) female and 35.7% (10) male. The proposed CNN model achieved a 92.20% accuracy rate in determining corneal status. The sensitivity value of the model was found to be 92.64%, and the specificity value was 91.84%.
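To make the layer list above more concrete, the following sketch traces feature-map sizes and learnable-parameter counts through the described layers. The abstract does not state padding or stride, so this assumes no padding, stride 1 for the convolutions, and stride 2 for the pooling. It also leaves out whatever classification layer feeds the softmax, since that layer is not described.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Spatial output size of a convolution or pooling window."""
    return (size + 2 * padding - kernel) // stride + 1


def conv_params(in_ch, out_ch, kernel):
    """Weights plus one bias per filter for a 2D convolution."""
    return kernel * kernel * in_ch * out_ch + out_ch


size, channels = 50, 1                        # 50x50 grayscale input

# 2D convolution, 128 filters of 3x3 (assumed: no padding, stride 1)
params_conv1 = conv_params(channels, 128, 3)
size, channels = conv_out(size, 3), 128

# Batch normalization: one scale and one shift per channel
params_bn = 2 * channels

# Max pooling with 2x2 windows (assumed stride 2), no parameters
size = conv_out(size, 2, stride=2)

# 2D convolution, 128 filters of 1x1: spatial size unchanged
params_conv2 = conv_params(channels, 128, 1)
size = conv_out(size, 1)

print(f"final feature map: {size}x{size}x{channels}")
print(params_conv1, params_bn, params_conv2)
```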
Conclusion
The proposed CNN model helps in optimizing follow-up and treatment plans by accurately determining the postoperative DMEK corneal status, aiding in early diagnosis. By distinguishing healthy/unhealthy corneas, the model reduces clinicians' workload and minimizes human error, providing more consistent and objective results.
Presenting Author
Paula E Maturana, MD
Co-Authors
Daniela López MSc, Claudia Goya MSc, Daniela Cabrerizo MD, Remigio Lopez-Solís PhD, Leonidas Traipe MSc, MD
Purpose
To describe ultrasound findings in the meibomian glands (MG) and lacrimal glands (LG) of patients with dry eye disease (DED) due to meibomian gland dysfunction (MGD), Sjögren's syndrome (SS), and other autoimmune diseases (AID).
Methods
Descriptive study of 60 patients with DED diagnosis according to DEWS II criteria (19 patients with SS, 19 with AID and 22 with MGD) who underwent Doppler ultrasound of MG and LG in both eyes (24 and 70 MHz transducer). The results were analyzed by a radiologist sub-specialized in ultrasound and morphometric parameters were registered for both glands, such as: intra and inter-gland ultrasound density, vascularization characteristics and structural changes (diameter, thickness, and density) in acini and ducts.
Results
Sixty patients were enrolled, with a mean age of 49.3 ± 12.70 years (range: 16 - 77), and 88.67% were women. Regarding MG, intra- and inter-acinus echogenicity alterations were observed in all cases (primarily hypoechogenicity). Structural alterations were present in 86% of patients, and hypervascularization was noted in 95% (with a higher systolic peak in patients with AID, averaging 10.48 cm/s). For LG, echogenicity, structural alterations, and hypervascularization were observed in 88%, 10%, and 75% of cases, respectively, especially in SS patients.
Conclusion
Ultrasound evaluation offers valuable insights into the structural changes of LG and MG in DED. Alterations in gland echogenicity and hypervascularization were the most frequent findings. This technique can reveal adaptive changes and inflammatory effects that are not detectable by other methods and may open new lines of research.
Presenting Author
Jonathan El-Khoury, MD
Co-Authors
Daniel Milad MD, Kenan Bachour MD, Mona Harissi-Dagher MD, FRCSC
Purpose
The study aims to evaluate the ability of ChatGPT-4 and ChatGPT-4o to triage new consultations and acute ocular complaints at the ophthalmology emergency clinic of a major Canadian teaching hospital. The study will also assess the models' predictive accuracy for diagnoses based on the information provided in the consultation or triage request.
Methods
This retrospective cross-sectional study was conducted at a major Canadian teaching center between January and December 2024. The study includes 200 patients referred to the ophthalmology emergency clinic, either with new consultation requests or for known patients with acute ocular complaints. The primary outcome was ChatGPT's triage accuracy, determined by comparing ChatGPT's recommended time frame for seeing the patient with the actual time frame within which the patient was seen. Factors that may have influenced the primary outcome, such as the number of clinical elements in the consultation or triage request, diagnostic accuracy, and subspecialty of case, were also analyzed.
Results
TBD
Conclusion
TBD