Zhixuan Zhao, Oisin Mac Aodha, Carola Riccarda Daniel, Nicolas Israeliantz, Anna Orekhova, Tobias Schwarz, Richard Mellanby, Christopher J. Banks
Background
Middle ear disease is a common and potentially serious condition in dogs, often resulting in pain, hearing loss, and neurological symptoms. Computed tomography (CT) is a valuable diagnostic tool for assessing middle ear pathology, but radiologist workload and potential diagnostic delays highlight the need for automated tools. While deep learning (DL) has shown promise in human radiology, its use in veterinary imaging is limited. This study aimed to develop a convolutional neural network (CNN) capable of classifying canine middle ears as normal or diseased using CT images, testing whether such a model could perform effectively with a relatively small dataset by leveraging transfer learning and data augmentation.
Methods
This analytical study used 535 canine CT images collected at the Royal (Dick) School of Veterinary Studies, University of Edinburgh, between 2009 and 2020. Radiologists classified the images as normal or diseased based on the presence of fluid or soft tissue in the tympanic bulla. Data were split into training (74%), validation (13%), and test (13%) sets. The team employed a ResNet-18 CNN architecture, using both feature extraction and fine-tuning approaches with ImageNet-pretrained weights. Data augmentation techniques (flips, rotations, and brightness/contrast changes), oversampling, and class weighting were applied to mitigate data imbalance. Model performance was assessed using accuracy, precision, recall, specificity, F1 score, and AUC metrics.
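The two class-balancing steps named above, class weighting and oversampling, can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the summary does not give the weighting formula or the class counts, so the inverse-frequency formula, the helper names `class_weights` and `oversample`, and the toy labels are all hypothetical.

```python
import random
from collections import Counter


def class_weights(labels):
    """Inverse-frequency weights: n_samples / (n_classes * class_count).

    Assumption: a common convention, not necessarily the one used in the study.
    """
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}


def oversample(labels, seed=0):
    """Return sample indices in which every class reaches the majority-class size.

    Smaller classes are topped up by drawing their indices with replacement.
    """
    rng = random.Random(seed)
    by_class = {}
    for index, label in enumerate(labels):
        by_class.setdefault(label, []).append(index)
    target = max(len(indices) for indices in by_class.values())
    balanced = []
    for indices in by_class.values():
        balanced.extend(indices)
        balanced.extend(rng.choices(indices, k=target - len(indices)))
    rng.shuffle(balanced)
    return balanced


# Hypothetical toy labels; the real class distribution is not given in the summary.
labels = ["normal"] * 6 + ["diseased"] * 2
weights = class_weights(labels)
balanced = oversample(labels)
```

In a training loop, weights like these would typically scale each class's contribution to the loss, while the oversampled indices decide which images are drawn in each epoch.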
Results
Among the ten tested models, the fine-tuned ResNet-18 model with dynamic data augmentation, class weighting, and oversampling (FT_05) achieved the best results, with an overall accuracy of 84.7%, precision of 0.963, recall of 0.722, and specificity of 0.972. Compared with a baseline model trained from scratch, FT_05 improved accuracy by 8.9% and recall by 23.8%. Models using feature extraction without full fine-tuning performed less well, which points to the importance of adapting pretrained weights to CT image data. Data augmentation and class balancing also improved model performance.
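To make the relationship between these metrics concrete, the sketch below derives them from a 2×2 confusion matrix, treating "diseased" as the positive class. The counts are an assumption: the summary does not report a confusion matrix, and these values were chosen only because they reproduce the reported accuracy, precision, recall, and specificity for a hypothetical 72-image test set. AUC is left out because it requires per-image scores rather than counts.

```python
def binary_metrics(tp, fp, fn, tn):
    """Compute standard binary classification metrics from confusion-matrix counts.

    No zero-division handling: assumes each denominator is non-zero.
    """
    precision = tp / (tp + fp)        # of predicted diseased, fraction truly diseased
    recall = tp / (tp + fn)           # of truly diseased, fraction detected (sensitivity)
    specificity = tn / (tn + fp)      # of truly normal, fraction correctly cleared
    return {
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "f1": 2 * precision * recall / (precision + recall),
    }


# Hypothetical counts, not taken from the paper (see the note above).
metrics = binary_metrics(tp=26, fp=1, fn=10, tn=35)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
# accuracy: 0.847
# precision: 0.963
# recall: 0.722
# specificity: 0.972
# f1: 0.825
```

With counts like these, high precision and specificity alongside lower recall would mean the model rarely flags a normal ear as diseased but misses a noticeable share of diseased ears.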
Limitations
The study’s diagnostic labels were based on expert radiologists’ assessments rather than confirmatory tests (e.g., otoscopy or histopathology). Using only a single 2D CT slice per patient, rather than full 3D datasets, may have restricted diagnostic information. Head asymmetry and downsampling from DICOM to TIFF images introduced variability, though this was mitigated during training. Finally, the dataset was modest in size and limited to images from two CT machines.
Conclusions
This study demonstrates that deep learning can classify canine middle ear disease from CT images with over 80% accuracy, even with a small dataset. Fine-tuning pretrained CNNs, combined with data augmentation and class balancing, markedly improves diagnostic performance. These findings highlight the feasibility of AI-assisted diagnostic support in veterinary radiology and underscore the value of developing larger annotated datasets for future model refinement.

CT images of canine middle ears. (A) Bilaterally normal middle ears, with air-filled lumens, thin bulla walls, and partially visible tympanic membranes (arrows). (B) Bilaterally diseased middle ears, in which the lumen is filled with soft tissue and mineralized material that is flush with the tympanic membrane (arrows) and the bulla walls are moderately thickened.
Disclaimer: The summary generated in this email was created by an AI large language model. Therefore errors may occur. Reading the article is the best way to understand the scholarly work. The figure presented here remains the property of the publisher or author and subject to the applicable copyright agreement. It is reproduced here as an educational work. If you have any questions or concerns about the work presented here, reply to this email.

