CNNs vs. Hybrid Transformers for Brain Tumor Classification on the BRISC Dataset
DOI: https://doi.org/10.31102/jatim.v6i1.3545

Abstract
Accurate and timely classification of brain tumors from Magnetic Resonance Imaging (MRI) is critical for effective treatment planning. The advent of deep learning has revolutionized medical image analysis, yet the performance of different model architectures is highly dependent on the quality of benchmark datasets and the specifics of the training methodology. This study presents a rigorous comparative analysis of four prominent deep learning architectures, ResNet18, EfficientNet-B0, MobileNetV3-Small, and the hybrid convolutional-transformer model MobileViTV2-100, for multi-class brain tumor classification. The models were trained and evaluated on the BRISC dataset, a large-scale, balanced collection of 6,000 T1-weighted contrast-enhanced MRI scans comprising glioma, meningioma, pituitary, and no-tumor classes. Employing a 5-fold cross-validation protocol with a full fine-tuning strategy and robust regularization techniques, this study assesses models on both classification accuracy and computational efficiency. The results indicate that MobileViTV2-100, ResNet18, and EfficientNet-B0 achieve statistically comparable state-of-the-art performance, with mean test accuracies of 98.88%, 98.72%, and 98.72%, respectively. MobileNetV3-Small, while being the most parameter-efficient, demonstrated significantly lower accuracy at 96.94%. A key finding reveals a performance-efficiency paradox: the largest model, ResNet18, exhibited the fastest inference latency (2.83 ms), challenging the conventional assumption that fewer parameters directly translate to greater speed. This comprehensive analysis underscores the strengths of hybrid architectures and provides critical insights into the practical trade-offs between model complexity, accuracy, and real-world deployability for clinical decision support systems.
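The 5-fold cross-validation protocol described above can be sketched in pure Python. This is an illustrative sketch only: the `stratified_kfold` function and the toy label list are assumptions for demonstration, not code from the study, which presumably used a library implementation such as scikit-learn's `StratifiedKFold`. The sketch shows the key property such a protocol must preserve on a balanced dataset like BRISC: each fold keeps the four classes in equal proportion.

```python
import random

def stratified_kfold(labels, k=5, seed=42):
    """Split sample indices into k folds while preserving class balance.

    Illustrative sketch of a stratified k-fold split: indices are grouped
    by class, shuffled per class, then dealt round-robin into k folds so
    every fold mirrors the overall class distribution.
    """
    rng = random.Random(seed)
    by_class = {}
    for idx, lab in enumerate(labels):
        by_class.setdefault(lab, []).append(idx)
    folds = [[] for _ in range(k)]
    for lab, idxs in by_class.items():
        rng.shuffle(idxs)
        for i, idx in enumerate(idxs):
            folds[i % k].append(idx)
    return folds

# Toy example: 4 balanced classes (glioma, meningioma, pituitary,
# no-tumor) with 20 samples each, mirroring the balanced BRISC layout
# at small scale (the real dataset has 1,500 scans per class).
labels = ["glioma", "meningioma", "pituitary", "no_tumor"] * 20
folds = stratified_kfold(labels, k=5)
# Each of the 5 folds holds 16 indices: 4 samples from every class.
```

In the cross-validation loop, each fold serves once as the held-out test set while the remaining four are used for fine-tuning, and the reported accuracies are means over the five runs.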

