Vision Transformer vs. ResNet-101: An Explainable Deep Learning Approach for Breast Cancer Detection in Ultrasound Images
Abstract
Breast cancer remains a significant global health concern, and early, accurate diagnosis is paramount for improving patient survival rates. This paper presents a comparative analysis of two deep learning architectures, the convolutional neural network (CNN) based ResNet-101 and the Vision Transformer (ViT), for classifying breast ultrasound images into benign, malignant, and normal categories. To address the common challenge of limited data, we employed a data augmentation strategy that expanded a benchmark dataset of 780 images to over 10,000 images, creating a robust training set. Both models were trained on this augmented dataset, achieving test accuracies of 98.64% for the ViT model and 97.57% for the ResNet-101 model; the results indicate that the ViT achieved higher accuracy than ResNet-101. However, deep learning models of this kind typically operate as black boxes. To enhance model transparency and build clinical trust, Gradient-weighted Class Activation Mapping (Grad-CAM), an Explainable AI (XAI) technique, is used to generate visual heatmaps highlighting the regions of the ultrasound images that were most influential in the models' diagnostic decisions. Both models were trained on GPU-based parallel infrastructure.
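As a concrete illustration of the Grad-CAM step described above, the following PyTorch sketch attaches hooks to the last convolutional block of a ResNet-101 classifier and produces a normalized heatmap over the input image. The 3-class head, the choice of `layer4[-1]` as the target layer, the 224x224 input size, and the dummy input tensor are assumptions made for illustration only; they are not details taken from the paper.

```python
# Minimal Grad-CAM sketch for a ResNet-101 ultrasound classifier (PyTorch).
# Target layer, head size, and preprocessing are illustrative assumptions.
import torch
import torch.nn.functional as F
from torchvision import models

class GradCAM:
    def __init__(self, model, target_layer):
        self.model = model.eval()
        self.activations, self.gradients = None, None
        target_layer.register_forward_hook(self._save_activation)
        target_layer.register_full_backward_hook(self._save_gradient)

    def _save_activation(self, module, inp, out):
        self.activations = out.detach()          # feature maps (1, C, H, W)

    def _save_gradient(self, module, grad_in, grad_out):
        self.gradients = grad_out[0].detach()    # gradients w.r.t. feature maps

    def __call__(self, x, class_idx=None):
        logits = self.model(x)                   # (1, num_classes)
        if class_idx is None:
            class_idx = logits.argmax(dim=1).item()
        self.model.zero_grad()
        logits[0, class_idx].backward()
        # Channel weights = global-average-pooled gradients
        weights = self.gradients.mean(dim=(2, 3), keepdim=True)   # (1, C, 1, 1)
        cam = F.relu((weights * self.activations).sum(dim=1, keepdim=True))
        cam = F.interpolate(cam, size=x.shape[2:], mode="bilinear",
                            align_corners=False)
        cam = (cam - cam.min()) / (cam.max() - cam.min() + 1e-8)  # normalize to [0, 1]
        return cam.squeeze().cpu().numpy(), class_idx

# Hypothetical usage with a 3-class head (benign / malignant / normal)
model = models.resnet101(weights=None)
model.fc = torch.nn.Linear(model.fc.in_features, 3)
cam = GradCAM(model, model.layer4[-1])
heatmap, pred = cam(torch.randn(1, 3, 224, 224))  # dummy ultrasound tensor
```

The resulting heatmap can be overlaid on the original ultrasound frame so that clinicians can see which regions drove the benign/malignant/normal prediction; an analogous procedure using attention or attribution maps would apply to the ViT branch.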
Keywords
Breast Cancer, Deep Learning, ResNet-101, Vision Transformer, Explainable AI, Grad-CAM