Gastroenterological Disease Detection using Transformer-Based Medical Image Analysis: Evaluating ViT-B16 on Curated Colon Data

Saba Iftikhar; Muhammad Waqas Asif; Muhammad Ayaz; Anam Safdar Awan; Rabia Basry; Muhammad Suleman Shahzad; Syeda Warda; Muhammad Naeem

doi:10.64105/ghbweq62

Authors

Saba Iftikhar, Department of Computer Science and Information Technology, Superior University Lahore, Pakistan
Muhammad Waqas Asif, Department of Computer Science, University of Gujrat, Pakistan
Muhammad Ayaz, Department of Computer Science, University of Management and Technology Lahore, Pakistan
Anam Safdar Awan, Department of Computer Science and Information Technology, Superior University Lahore, Pakistan
Rabia Basry, Department of Computer Science and Information Technology, Superior University Lahore, Pakistan
Muhammad Suleman Shahzad, Department of Computer Science and Information Technology, Superior University Lahore, Pakistan
Syeda Warda, Department of Computer Science and Information Technology, Superior University Lahore, Pakistan
Muhammad Naeem, Department of Computer Science and Information Technology, Superior University Lahore, Pakistan

DOI:

https://doi.org/10.64105/ghbweq62

Keywords:

Curated Colon Dataset (CCD), Detection of Subtle Anomalies, Gastrointestinal Diseases, Prediction of Patient Outcome, Vision Transformer (ViT-B16).

Abstract

Early detection of gastroenterological diseases can improve patient outcomes and reduce the burden of late-stage diagnosis. Traditional models, including CNNs, have been limited in capturing complex patterns in medical imaging datasets, which has prompted the investigation of transformer architectures such as the Vision Transformer (ViT). However, the use of ViT models for gastroenterological disease detection in medical image analysis remains relatively underexplored. This study evaluates the effectiveness of the ViT-B16 variant in predicting patient outcomes and detecting subtle anomalies using the Curated Colon Dataset (CCD). The transformer-based model was trained and tested on this dataset, and its performance was compared with that of traditional CNNs. ViT-B16 reached 99.5% accuracy, while ResNet and EfficientNet reached 91.3% and 92.5%, respectively. Precision, recall, and AUC were also high; the AUC was approximately 0.99, which indicates accurate discrimination between disease classes. These results demonstrate that the ViT-B16 model has potential for medical diagnostic tasks, particularly classification and prediction of patient outcomes, with possible applicability in real-world clinical settings, where informed decision-making, explainable AI approaches, and clinical interpretability remain essential. However, challenges such as clinical data integration and ethical considerations in diagnostics, together with the need for multimodal image fusion and better diagnostic performance on minority classes, highlight areas for future work.
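The abstract reports accuracy, precision, recall, and AUC without showing how they are computed. The following is a minimal sketch of these metrics for the binary case, written in plain Python. The function names and the toy labels and scores are assumptions made for illustration only; they are not the authors' evaluation code, and the numbers are not taken from the study or from the CCD.

```python
from typing import Sequence, Tuple


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def precision_recall(
    y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1
) -> Tuple[float, float]:
    """Precision and recall for one class treated as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def binary_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive sample scores higher
    than a random negative sample; ties count as one half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example (hypothetical values, not results from the study).
toy_true = [1, 0, 1, 1, 0, 0]
toy_pred = [1, 0, 0, 1, 0, 1]
toy_scores = [0.9, 0.2, 0.4, 0.8, 0.1, 0.6]

if __name__ == "__main__":
    print("accuracy:", accuracy(toy_true, toy_pred))
    print("precision, recall:", precision_recall(toy_true, toy_pred))
    print("AUC:", binary_auc(toy_true, toy_scores))
```

For a multi-class problem, one common choice is to compute these one-vs-rest per class and average them; whether the study did so is not stated in the abstract.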

PJMCR